Abstract

This thesis develops a methodology and a tool for knowledge acquisition with the
new probabilistic knowledge representation, the Bayesian Forest. It establishes the
structure of the Knowledge Acquisition and Maintenance module of the Probabilities,
Expert Systems, Knowledge and Inference (PESKI) architecture. The tool, MACK, is
designed to be used directly by the domain expert(s) rather than by knowledge
engineer(s), and thus supports automated knowledge acquisition.

This  research  determines  and  implements  the  constraints  necessary  to  ensure  the 
consistency  of  Bayesian  Forest  knowledge  bases  as  data  is  both  acquired  and  subsequently 
maintained. The impact on the PESKI architecture of time-dependent information and
default  assumptions  during  reasoning  is  also  explored.  The  tool  has  been  applied  to 
NASA’s  Post-Test  Diagnostics  System  which  locates  anomalies  aboard  the  Space 
Shuttles’  Main  Engines. 




1.  INTRODUCTION 


Automated  reasoning  draws  much  of  its  potency  from  the  way  in  which  its 
knowledge  is  organized  and  stored.  These  knowledge  representations  are  the  underlying 
data  structures  in  electronic  memory  which  can  be  accessed  for  use  by  the  computer. 
They  determine  the  methods  of  computational  inferencing  available  to  the  expert  system 
and,  by  extension,  the  range  of  problems  to  which  automated  reasoning  may  be  applied. 
The  problems  of  acquiring  knowledge  for  the  representation  can  make  the  task  of  building 
a  useful  and  reliable  expert  system  difficult  indeed. 


Flexible Knowledge Representation
There  is  an  inverse  relationship  between  the  efficiency  of  an  automated  reasoning 
algorithm  and  the  flexibility  of  its  associated  knowledge  representation.  However,  the 
former  is  secondary  to  the  overall  capabilities  of  the  final  product  when  weighed  against 
the  latter.  Without  an  appropriate  representation  we  cannot  properly  store  our  knowledge. 
Thus,  system  designers  often  give  preference  to  flexible  representations  [18]. 
Nevertheless,  real-world  applications  for  intelligent  or  expert  systems,  such  as  those  in 
space  operations  domains,  require  a  balance  of  these  two  capabilities.  As  we  shall  see, 
Bayesian  Forests  [49] — an  extension  of  the  more  common  Bayesian  networks — are  just 
such  a  representation  which  can  provide  both  a  flexible  knowledge  structure  and  efficient 
inferencing algorithms over that representation.

The more flexible a knowledge representation is, the more successfully it can model
the  given  domain.  It  must  "bend"  itself  to  accommodate  the  uncertainties  common  in  life 
and  other  real-world  applications.  For  example,  many  would  agree  with  the  general  "rule 
of  thumb"  that  "All  birds  can  fly."  However,  such  rules  alone  are  insufficient.  Every  rule 
has  its  exceptions.  Penguins,  ostriches  and  the  television  character  Big  Bird,  to  name  a 
few,  cannot  fly.  Thus,  the  statement  "All  birds  can  fly"  is  neither  completely  true  nor 
completely  false.  Its  distance  from  these  extremes  is  its  uncertainty.  While  we  can 
reduce  that  uncertainty  by  the  explicit  incorporation  of  exceptions  into  the  rule,  i.e.,  "All 
birds,  other  than  Big  Bird,  can  fly  unless  they  are  penguins  or  ostriches,"  the  sheer 
number  of  exceptions  often  makes  this  prohibitive.  In  our  avian  example  new  exceptions 
might  include  turkeys,  emus,  hatchlings  and  any  bird  with  a  broken  wing,  for  starters. 

Inference  mechanisms  such  as  those  in  Bayesian  Forests  are  noteworthy  for  their 
ability  to  represent  such  uncertainty  precisely  because  they  marry  the  strong  models  of 
probability  theory  with  an  "if-then"  rule  structure  similar  to  those  rules  which  Shortliffe 
and Buchanan [62] found most natural for human experts. Reliance on the
well-established laws of probability helps guarantee that Bayesian Forests are a sound and
consistent representation of knowledge [53], [49] and, therefore, that the results they
generate will be consistent.


Forests versus Networks
The  line  between  Bayesian  Forests  and  Bayesian  networks  begins  with  notions  of 
flexibility, as previously discussed. It continues into the methods available to each model
for efficient computation. Computing with Bayesian networks is NP-hard [12], [60];
however, forests build upon algorithmic approaches from the field of operations research
which demonstrate polynomial growth rates in the expected case [53], [49]. Moreover,
these  algorithms  can  be  specialized  for  various  domains. 

At  the  lowest  level  the  primary  distinction  between  the  two  Bayesian  reasoning 
representations — Bayesian  Forests  and  Bayesian  networks — is  in  their  respective  treatment 
of  the  random  variables  over  which  they  are  defined.  In  either  reasoning  scheme  these 
variables represent objects or events.¹ The assignment of a value to a random variable
instantiates  it,  and  a  complete  set  of  instantiations,  i.e.,  the  assignment  of  one  value  to 
each  and  every  random  variable,  defines  what  we  call  a  state  of  the  world  [51],  [52],  [54]. 

These  differences  between  Bayesian  networks  and  Forests  are  evident  in  two  areas: 

1)  The  topological  ordering  of  the  random  variables,  and 

2)  The  definition  of  the  conditional  independencies. 

There  is  a  particular  topological  ordering  of  the  random  variables  of  the  Bayesian  network 
and  these  random  variables  define  the  conditional  independencies  within  the  network.  On 
the  other  hand,  the  Bayesian  Forest  maintains  a  topological  ordering  of  the  instantiations 
of  its  random  variables.  Similarly,  the  instantiations  define  the  forest’s  conditional 
independencies. 

In other words, the random variable is the Bayesian network’s smallest component
for reasoning purposes, whereas the forest reasons at the finer level of the individual
instantiation. This, combined with the new conditional independencies, also
frees the forest from the necessity, common to other Bayesian analysis systems like
Bayesian networks, for a complete specification of all possible probabilities [40]. The
Bayesian Forest inference engine can function with incomplete information.

¹ For our purposes, we also assume them to be discrete.

For example, given an automobile and a traffic signal as Bayesian network random
variables and some value, z_i, between 0 and 1, Bayesian network construction allows us
either to predicate the automobile’s motion upon the state of the signal,²

P(car = stopped | signal = red) = z_1    (1-1)

or the current state of the signal upon the auto,

P(signal = green | car = on_detector) = z_2    (1-2)

but not both, as this would preclude any topological ordering over these two variables.³
In contrast, the topological ordering implicit in a Bayesian Forest is based not upon
random variables alone, but upon the various instantiations that are available for those
variables. Thus, a Bayesian Forest’s topological ordering would prevent the introduction
of equation (1-3) into the knowledge base because of its potential conflict with equation
(1-1), while permitting us to include both equations (1-1) and (1-2), the apparently cyclic
example above, precisely because the assignments to the variables differ [49].


² Read: "The probability that the car is stopped given that the light is red." "Car = stopped" is the head of the
probability; "signal = red," its tail. See Chapter 2 for further details.

³ This topological ordering determines the conditional independencies between random variables and becomes
an important factor in the eventual calculation and storage of probabilities.


P(car = stopped | signal = red) = z_1    (1-1)

P(signal = red | car = stopped) = z_3    (1-3)
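The contrast drawn by equations (1-1) through (1-3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule encoding, the function names (`has_cycle`, `variable_edges`, `instantiation_edges`) and the reading of "potential conflict" as a directed cycle among instantiations are ours, not constructs taken from the thesis or from MACK.

```python
from collections import defaultdict

def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    colour = defaultdict(int)

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:       # back edge: we returned to the current path
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(graph))

def variable_edges(rules):
    """Bayesian-network view: an edge from each tail variable to the head variable."""
    return [(tail_var, head[0]) for head, tail in rules for tail_var, _ in tail]

def instantiation_edges(rules):
    """Bayesian-Forest view: an edge from each tail instantiation to the head instantiation."""
    return [(inst, head) for head, tail in rules for inst in tail]

# Each rule is (head, tail); an instantiation is a (variable, value) pair.
rule_1_1 = (("car", "stopped"), [("signal", "red")])
rule_1_2 = (("signal", "green"), [("car", "on_detector")])
rule_1_3 = (("signal", "red"), [("car", "stopped")])

network_view = has_cycle(variable_edges([rule_1_1, rule_1_2]))
forest_view = has_cycle(instantiation_edges([rule_1_1, rule_1_2]))
forest_with_1_3 = has_cycle(instantiation_edges([rule_1_1, rule_1_3]))
```

Under these assumptions, rules (1-1) and (1-2) form a cycle when only variables are considered, but not when instantiations are considered, while (1-1) and (1-3) form a cycle even at the instantiation level.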

Knowledge Acquisition
Because  Bayesian  Forests  are  a  relatively  new  representation,  very  little  work  has 
been  done  to  date  on  methods  of  acquiring  knowledge  with  them.  Forests  provide  a  more 
powerful  knowledge  representation  [53],  [49],  but  it  remains  to  be  seen  if  this  new 
representation  better  lends  itself  to  new  or  existing  methods  of  knowledge  acquisition. 

Feigenbaum  [18]  describes  the  acquisition  of  new  knowledge  as  an  aspect  of  an 
automated  reasoning  system  which  eclipses  both  its  knowledge  representation  and  its 
inferencing  capabilities  in  importance.  Knowledge  acquisition’s  central  role  in  artificial 
intelligence  derives  from  the  fact  that  "knowledge  is  power"  [10].  Simply  stated,  the 
performance  of  an  expert  system  is  a  direct  function  of  both  the  quality  and  quantity  of 
the  encoded  knowledge.  Thus,  in  general,  more  knowledge  enhances  the  performance  of 
our  intelligent  systems.  However,  it  must  be  more  of  the  right  kind  of  knowledge, 
organized  and  codified  in  ways  applicable  and  appropriate  to  our  symbolic  computations. 

The  critical  subareas  of  an  expert  system  include:  its  reasoning  methodologies,  the 
organization  of  its  knowledge  base,  the  validity  of  that  knowledge,  and  of  course,  its 
interface  with  the  user  (See  Figure  1.1)  [53],  [27],  [70],  [28].  As  we  can  see  simply  from 
their  titles,  knowledge  is  integral  to  two  of  these  four  components.  As  such,  the  details 
of knowledge acquisition, especially when that knowledge includes uncertainty (i.e.,
incomplete information and/or facts which are neither entirely true nor entirely false), can
become major impediments to the development of real-world expert systems [53].


Knowledge Acquisition by Articulation
Currently,  knowledge  acquisition  methods  fall  into  one  of  three  categories: 
articulation,  induction  and  automated  [13].  Articulation  can  be  described  as  the  "person- 
to-person"  approach  to  knowledge  acquisition.  It  requires  the  services  of  a  knowledge 
engineer  to  interview  the  expert  about  her/his  area(s)  of  expertise,  to  decide  upon  an 
appropriate  knowledge  representation  for  the  data,  to  encode  the  information,  and  finally 
to  validate  the  accuracy  of  the  final  product.  Often,  articulative  methods  try  the  expert’s 
patience since they include the engineer shadowing the expert in and around the
workplace, constantly inquiring about the underlying reasons for this or that action [27],
[70], [28].

Obviously,  the  success  of  articulation  methods  depends  heavily  not  only  upon  the 
skills  of  the  knowledge  engineer,  but  equally  so  upon  the  openness  of  the  expert  and  his 
or  her  ability  to  communicate  knowledge  to  the  engineer.  Much  research  has  been  done 
on  methodologies  to  improve  this  process  [57],  [20],  [19],  [37].  Ideally,  the  knowledge 
engineer  is  a  good  student — at  least  of  the  subject  at  hand — and  the  expert  is  a 
comparable teacher of the same. In effect, the engineer is apprenticing himself or herself
to  the  expert  for  a  period  of  time  sufficient  only  to  glean  the  basics  and  transcribe  them 
into  machine  language.  In  reality,  though,  true  apprenticeships  involve  years  of  study  and 
practice.  While  it  is  still  the  most  common  form  of  knowledge  acquisition,  articulation 
is  both  a  time-  and  labor-intensive  process  that  iterates  as  the  knowledge  engineer 
regularly  returns  to  the  expert  for  verification  [13],  [27]. 

Knowledge Acquisition by Induction

Knowledge  acquisition  by  induction  [67]  requires  the  expert  to  enumerate  the 
members  of  a  set  of  random  variables,  now  called  attributes,  and  their  associated  values. 
Then,  for  all  possible  assignments  of  values  to  this  set,  s/he  asserts  a  hypothesis  or 
conclusion  for  each  member  attribute.  This  type  of  knowledge  acquisition  is  the  basis  for 
Quinlan’s ID3 knowledge acquisition algorithm [11]. Inductive assumptions presume the
given hypotheses to be mutually exclusive and correct, all the attributes relevant and all
attribute values discrete. Unfortunately, inductive acquisition techniques such as ID3 pay
a very high price in that they require complete test cases to stave off degradations in
rule-set confidence. Clearly, as the numbers of attributes and values grow, the size of the
question  set  presented  to  the  expert  quickly  becomes  combinatorially  large  [13]. 

For  example,  a  set  of  eight  attributes,  four  of  which  have  three  possible  values,  and 
two pairs of which have four and five, yields 32,400 (3^4 x 4^2 x 5^2) value combinations
each  of  which  requires  eight  hypotheses  or  conclusions  from  the  expert  (one  for  each 
attribute).  Few  experts  can  be  expected  to  want  or  to  be  able  to  contribute  over  a  quarter 
of  a  million  inputs  of  unknown  complexity  for  such  a  small  system!  Larger  systems  are 
clearly  prohibitive. 
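The arithmetic behind this example is easy to reproduce. The short Python sketch below simply multiplies out the attribute value counts given above; the variable names are ours.

```python
from math import prod

# Number of possible values for each of the eight attributes in the example:
# four attributes with three values, two with four values, two with five values.
value_counts = [3, 3, 3, 3, 4, 4, 5, 5]

# Every combination of attribute values must be presented to the expert ...
combinations = prod(value_counts)

# ... and each combination needs one hypothesis per attribute.
expert_inputs = combinations * len(value_counts)

print(combinations, expert_inputs)
```

Adding a single further three-valued attribute would triple `combinations`, which is the combinatorial growth the text warns about.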

Moreover,  inductive  knowledge  acquisition  approaches  like  the  ID3  algorithm 
develop  rule  decision  trees  using  entropy  functions.  Such  trees  may  require  the  inference 
engine  to  analyze  the  entire  structure  for  a  single  hypothesis,  much  less  for  the  entire  set. 
Their  required  structure  builds  redundancy  into  the  knowledge  base  of  a  system  which  is 
already  costly  [13]. 


Automated Knowledge Acquisition
The last approach, automated knowledge acquisition, attempts to dispense with the
knowledge engineer completely. In actuality, the knowledge engineering tasks
have been transferred elsewhere in the software development process such that the
program itself can query the expert and format the information received. This method
identifies and partitions knowledge into modules that are tailored to a particular
knowledge representation or expert system shell. The use of modules constrains and
guides the expert through the process by receiving input, checking the consistency of the
data already entered, and querying the expert to correct any inconsistencies found [13],
[69], [35].


MACK

Our work here is in support of a new, integrated expert system framework called
PESKI (Probabilities, Expert Systems, Knowledge and Inference). PESKI’s overall
goal is strong semantics and more effective algorithms for reasoning with uncertainty
[53]. Specifically, this paper addresses Knowledge Acquisition and Maintenance, one of
PESKI’s four major components (see Figure 1.2⁴). The others (the Inference Engine, the
Explanation & Interpretation facilities, and the Natural Language Interface) are being or
will be explored separately (see [4], [56]). Because of the manner in which we match
the application to a method, our approach to the development of a knowledge acquisition
tool for PESKI and Bayesian Forests can be termed "middle-out."⁵

Figure 1.2  The PESKI Architecture [53]

In this paper we present an automated knowledge acquisition tool for Bayesian
Forests called MACK, the Module for the Acquisition of Consistent Knowledge. MACK
is a tool designed not only to acquire, but also to maintain, knowledge bases while
guaranteeing the consistency of the knowledge therein.

By combining some of the strengths of inductive and automated knowledge
acquisition methods, MACK enhances the development of probabilistic reasoning systems
like PESKI while also addressing issues of temporal logic and default reasoning which
often arise. Individually, each of these areas (probabilistic reasoning, temporal reasoning,
and reasoning with default assumptions) has been shown to be NP-hard [12], [60], [55],
[49]. This work determines their intersection with the new PESKI expert system
architecture and integrates those aspects, as appropriate, into the Knowledge Acquisition
and Maintenance module, instantiated here with the Bayesian Forest knowledge
representation.

⁴ Broken-bordered objects in Figure 1.2, specifically Knowledge Engineer and Knowledge Engineering Tools, are
considered optional in this architecture [53].

⁵ The "middle-out" approach develops knowledge acquisition techniques for a method (in this case, Bayesian
Forests), then applies them to one or more particular domains. "Bottom-up" tool-building works from a specific
problem domain and incorporates much of the domain knowledge into the resultant tool. The "top-down" approach uses
general analysis and minimal knowledge of the domain [8], [9], [25], [14], [36], [17], [38].

Presently,  researchers  at  the  National  Aeronautics  and  Space  Administration 
(NASA)  are  developing  an  intelligent  system  to  facilitate  post-test  diagnosis  of  the  main 
engines  of  the  Space  Transportation  System,  more  commonly  known  as  the  Space  Shuttle 
[7],  [43],  [48].  Knowledge  acquisition  by  articulation  for  this  system  has  been  in  progress 
for  over  thirty  months,  yet  the  net  result  to  date  is  a  single  complete  component  which 
reasons  over  data  from  the  shuttle’s  high  pressure  oxidizer  turbopump.  We  apply  our 
efforts  to  this  part  of  the  NASA  project. 

No previous work on the Post-Test Diagnostic System (PTDS) has taken a
probabilistic approach to the problem.⁶ In fact, system designers rejected both rule-based
and probabilistic methodologies in favor of a case-based look-up scheme for the systems
module. They opted against the former because of causal feedback loops in the
knowledge, while the use of probabilities was estimated to add an order of magnitude to
problem complexity [5]. Bayesian Forests, however, provide the tools to handle the
causal loops and the flexibility better to balance the strictures of probabilistic reasoning
with the uncertainties common to a real-world domain like PTDS.⁷ It has been our
intention to demonstrate these capabilities of Bayesian Forests via the re-development of
the high-pressure oxidizer turbopump component using the MACK tool. Chapter 2
explains the concerns we faced specific to Bayesian Forests and how automatically to
identify inconsistencies found in Bayesian Forest knowledge bases. Chapter 3 describes
the Space Shuttle application itself, while Chapter 4 addresses procedural issues of
knowledge acquisition using MACK.

⁶ Real-time sensor data validation modules do use Bayesian information fusion techniques. However, the sensor
validation system, which has indirect control of a manned vehicle, is separate from the solely advisory Post-Test
Diagnostics [6], [73].

⁷ Previous work on the particular PTDS component we chose presently handles uncertainty only by prepending
the words: "possible," "may indicate," or "may be" to 91.6% of non-historical anomaly reports.


2.  BAYESIAN FORESTRY


Up  to  this  point  we  have  alluded  to  Bayesian  Forests  without  a  proper  description. 
Bayesian Forests, the most recent manifestation of Santos’ Bayesian Multinetworks [49],⁸
are a rule-based probabilistic reasoning methodology. Probabilistic reasoning in intelligent
systems exploits the fact that probability theory is and has been an accepted language
both for the description of uncertainty and for making inferences from incomplete
knowledge [33]. Using the semantics of probability theory, we designate random
variables to represent the discrete objects or events in question. We then assign a joint
probability to each possible state of the world, i.e., each specific assignment of a value
to every random variable. This assignment allows us to reason over the set
of probabilities [2], [3].

Unfortunately,  the  size  of  this  joint  distribution  can  grow  exponentially  in  the 
number  of  random  variables  making  solutions  and  knowledge  acquisition  computationally 
intractable  [33],  [42].  One  way  to  address  this  complexity  is  by  assuming  many,  if  not 
all,  of  the  random  variables  to  be  independent.  However,  while  such  independence 
assumptions  significantly  facilitate  knowledge  acquisition  and  ultimate  resolution,  if 
applied  carelessly,  they  can  oversimplify  the  problem  statement  such  that  the  final  answer 
loses  considerable  validity  [33]. 

⁸ Originally dubbed "Bayesian Multi-Nets," the name has been changed to "Bayesian Forests" to distinguish them
from the Bayesian multinetworks of Heckerman and Geiger [22].

Bayesian systems avoid oversimplification by couching their independence
assumptions in terms of conditional dependencies. Let D, E and F be random variables.
The conditional probability, P(D | E, F), identifies the belief in D’s truth given that E and
F are both known to be true. If P(D | E, F) = P(D | E), we say that random variables D and
F are conditionally independent given E. In other words, once we know E is true, we can
establish D’s truth with or without any knowledge of F’s [41].

Bayesian philosophy holds that such conditional relationships, e.g., P(D | E), are
more in keeping with the way humans tend to organize knowledge [31], [59], [58].
Equation (2-1) below shows Bayes’ Formula for computing these probabilities.

P(D | E) = P(E | D) P(D) / P(E)    (2-1)

We can view random variable D as a possible hypothesis (or set of hypotheses) held in
advance and E as the actual evidence that was or will be generated. The formula shows
how previous hypotheses should be modified in light of that new evidence [45].

P(E, D) = P(D | E) P(E)    (2-2)

Equation (2-2) manipulates Bayes’ Formula to allow us to compute the joint
distribution. It generalizes to n variables as shown in Equation (2-3).⁹

P(D, E, F, G, H) = P(D | E, F, G, H) P(E | F, G, H) P(G | F, H) P(H | F) P(F)    (2-3)

More importantly to PESKI’s inferencing, the subsequent incorporation of the known
independence conditions further reduces the amount of information we must actually
store to be able to compute the required joint probability [41], [39], [33], [52], [51],
[54]. Thus, let us assume that random variable E is known to be conditionally
independent from both F and H when the value of G is known, and D is likewise
conditionally independent with knowledge of both E and G. Then, we can simplify
Equation (2-3) further still (see Equation (2-4)).

P(D, E, F, G, H) = P(D | E, G) P(E | G) P(G | F, H) P(H | F) P(F)    (2-4)

⁹ Equation (2-3) shows one of the n! possible expansions; n is equal to 5 in this example.
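To make the storage saving concrete, the following Python sketch counts the independent probability values needed for the chain-rule expansion of Equation (2-3) and for the reduced form of Equation (2-4). It assumes, purely for illustration, that all five variables are binary; the thesis only requires them to be discrete. The names `table_size` and `stored_values` are ours.

```python
def table_size(num_parents, arity=2):
    """Independent entries in a conditional table P(V | parents).

    Assumes every variable takes `arity` values: each of the arity**num_parents
    parent configurations needs (arity - 1) free probabilities.
    """
    return (arity - 1) * arity ** num_parents

def stored_values(factorization, arity=2):
    """Total independent entries across all factors of a factorization."""
    return sum(table_size(len(parents), arity) for parents in factorization.values())

# Each factorization maps a variable to the variables it is conditioned on.
eq_2_3 = {"D": ["E", "F", "G", "H"], "E": ["F", "G", "H"],
          "G": ["F", "H"], "H": ["F"], "F": []}
eq_2_4 = {"D": ["E", "G"], "E": ["G"],
          "G": ["F", "H"], "H": ["F"], "F": []}

full_expansion = stored_values(eq_2_3)   # same as the full joint: 2**5 - 1
reduced_form = stored_values(eq_2_4)

print(full_expansion, reduced_form)
```

Under the binary assumption the unreduced expansion stores as many values as the full joint distribution, while the two independence assumptions cut the requirement by more than half.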

These conditional dependencies can also be represented pictorially with a directed
graph. Furthermore, the graph must be acyclic for a Bayesian network, else it
would be impossible to determine an equation for computing the joint
distribution similar to equation (2-4) using the conditional dependencies.

We illustrate Bayesian Forests with just such a graphical comparison to the
more conventional Bayesian network. Let A, B and C be random variables in a
Bayesian network representing a traffic light, its associated vehicle detector and pedestrian
signal, respectively. Figure 2.1 graphically depicts this network over these variables.
Since the signal depends upon the light, we say that A is the parent of C. Similarly, B
is the parent of A.

Now,  assume  we  want  for  some  reason  to  expand  this  set  with  a  probability  for  the 
detector  stating  the  likelihood  of  its  being  tripped  during  rush  hour.  Such  an  inclusion 
would  introduce  a  cycle  into  our  Bayesian  network  since  the  detector  and  traffic  light 
cannot  both  depend  upon  the  other.  It  becomes  synonymous  to  the  classic  circular 
reasoning  example:  "If  Smoke,  then  Fire"  coupled  with  "If  Fire,  then  Smoke." 

Herein lies the added flexibility of the Bayesian Forest. Assuming the same trio
of random variables and the partial set of values below:

P(C = "Don’t Walk" | A = red) = x_1    (2-5)

P(C = "Walk" | A = green) = x_2    (2-6)

P(A = green | B = On) = x_3    (2-7)

P(A = red | B = Off) = x_4    (2-8)

we can quite legally add the new constraint:

P(B = On | A = red, D = rush hour) = x_5    (2-9)

without creating a directed cycle. Figure 2.2 shows the graphical representation of this
example.

Notice how the graph of a Bayesian Forest is simple, bipartite and directed.¹⁰
It has two distinct types of nodes. The first, shown as lettered ovals, corresponds to
nodes found in a Bayesian network. However, now these nodes are particular not simply
to a random variable, but to a specific instantiation thereof. Hence Figure 2.1’s
single node for variable A, the traffic light, becomes two distinct instantiation
nodes for "A = Red" and "A = Green" in Figure 2.2.

¹⁰ See [71] for further discussion of graph definitions.

The second type of node, depicted as a blackened circle, is called a support
node. When drawn, these nodes, which represent the numeric probability value
itself, have exactly one outbound arrow to the instantiation node representing the
probability’s head. Support nodes also provide a graphical terminus for zero or more
inbound dependency or conditioning arrows from each of the parent instantiations in the
probability’s tail. Conversely, all instantiation nodes must have one or more inbound
arrows to establish their probabilities and sets of supporting conditions, but need not
contain any outbound arrows.

Reasoning algorithms for Bayesian Forests are built upon inference engines
previously developed for weighted "AND/OR" directed acyclic graphs (WAODAG) [49],
[4], [41], [39]. In this schema, Bayesian Forest support nodes correspond to WAODAG
AND nodes precisely because all parent instantiation nodes must be active or true before
the support may be. Likewise, a Bayesian Forest’s instantiation nodes correspond roughly
to the WAODAG’s OR node since any one of an instantiation node’s support conditions
is sufficient to activate it. The correspondence is inexact in that the Bayesian Forest
instantiation node actually represents an exclusive-or condition. Any support condition
may substantiate an instantiation node, but that support node must be the only one active.
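The two node types and their activation conditions can be sketched as a small Python data structure. This is one literal reading of the description above, not the WAODAG inference engine itself; the class `Support`, the helper names and the probability values are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Support:
    """A support node: one probability value, one head, zero or more tail parents."""
    head: tuple          # (variable, value) receiving the single outbound arrow
    tail: frozenset      # parent instantiations supplying the inbound arrows
    probability: float   # placeholder value, not taken from the thesis

def support_can_fire(support, active):
    """AND condition: every parent instantiation in the tail must be active."""
    return all(parent in active for parent in support.tail)

def exactly_one_support(head, fired):
    """Exclusive-or condition: precisely one fired support may point at `head`."""
    return sum(1 for s in fired if s.head == head) == 1

# Rules in the spirit of equations (2-7) and (2-9); the numbers are made up.
green_rule = Support(("A", "green"), frozenset({("B", "On")}), 0.9)
rush_hour_rule = Support(("B", "On"),
                         frozenset({("A", "red"), ("D", "rush hour")}), 0.7)
green_prior = Support(("A", "green"), frozenset(), 0.5)   # hypothetical tail-less support

fires = support_can_fire(rush_hour_rule, {("A", "red"), ("D", "rush hour")})
blocked = support_can_fire(rush_hour_rule, {("A", "red")})
single = exactly_one_support(("A", "green"), [green_rule])
doubled = exactly_one_support(("A", "green"), [green_rule, green_prior])
```

A support with an empty tail can always fire, which matches the "zero or more inbound arrows" allowance; an instantiation backed by two fired supports at once fails the exclusive-or test.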

As  we  have  seen,  Bayesian  Forests  allow  ambidirectional  construction,  i.e.,  we  can 
have P(A = a_1 | B = b_2, C = c_1) and P(B = b_2 | A = a_2) in our database simultaneously. However,
this  additional  constructive  capacity  changes  the  concept  of  consistency  for  a  knowledge 
base  in  ways  unique  to  Bayesian  Forests. 

Bayesian Forest inference algorithms operate straightforwardly by computing a joint
probability for a particular assignment of values to every variable available in the random
variable set; henceforth we will refer to such an assignment as a "state of the world."
In an inconsistent forest there will be multiple ways to compute this value for any
particular state of the world with no guarantee that the results will be equal as they must [49].

To  illustrate,  let  X,  Y  and  Z  be  boolean  random  variables.  Clearly,  there  are  eight 
possible  states  of  the  world.  Given  the  following  probabilities  as  the  entire  population 
of  the  forest’s  database: 


P(X = true | Y = false) = .40    (2-10)

P(X = true | Z = true) = .80    (2-11)

P(Y = false | X = true) = .70    (2-12)

P(Y = false | Z = true) = .45    (2-13)

P(Z = true) = .75    (2-14)

The inference engine may compute P(X = true, Y = false, Z = true)¹¹ by multiplying
equations (2-10), (2-13) and (2-14) or by multiplying equations (2-12), (2-11) and (2-14).
However, the joint probability on the first of these logical paths is .135, while along the
second path it is .42, more than three times greater!
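The two competing computations can be checked directly. The Python sketch below stores equations (2-10) through (2-14) in a dictionary and multiplies along each path; the string encoding of heads and tails and the name `joint` are ours.

```python
from math import prod

# Conditional probabilities keyed by (head, tail), as given in the text.
forest = {
    ("X=true", ("Y=false",)): 0.40,   # (2-10)
    ("X=true", ("Z=true",)): 0.80,    # (2-11)
    ("Y=false", ("X=true",)): 0.70,   # (2-12)
    ("Y=false", ("Z=true",)): 0.45,   # (2-13)
    ("Z=true", ()): 0.75,             # (2-14)
}

# First path: equations (2-10), (2-13) and (2-14).
path_one = [("X=true", ("Y=false",)), ("Y=false", ("Z=true",)), ("Z=true", ())]
# Second path: equations (2-12), (2-11) and (2-14).
path_two = [("Y=false", ("X=true",)), ("X=true", ("Z=true",)), ("Z=true", ())]

def joint(path):
    """Multiply the stored probabilities along one path through the forest."""
    return prod(forest[rule] for rule in path)

first, second = joint(path_one), joint(path_two)
print(first, second)
```

Up to floating-point rounding, the two products are the .135 and .42 quoted above, for one and the same state of the world.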

In order to develop a functional inference engine, we eliminate this inconsistency
by requiring any group of compatible¹² probabilities which share a head to have exactly
equal probability values. In other words, we guarantee that the probabilistic value of
equation (2-11) is the same as the value of equation (2-10) and that equation (2-13) is
also equal to (2-12) as shown below.¹³


P(X = true | Y = false) = .40    (2-10a)

P(X = true | Z = true) = .40 (formerly .80)    (2-11a)

P(Y = false | X = true) = .70    (2-12a)

P(Y = false | Z = true) = .70 (formerly .45)    (2-13a)

P(Z = true) = .75    (2-14a)
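The equality requirement lends itself to an automatic check. The following Python sketch flags pairs of probabilities that share a head, have compatible tails, and yet carry different values. It is a simplified illustration and not MACK's actual algorithm; the rule encoding and the names `compatible` and `inconsistent_pairs` are assumptions, and compatibility is read as "no variable common to both tails takes a different value in each."

```python
def compatible(tail_a, tail_b):
    """Tails are compatible unless a variable in both takes different values."""
    return all(tail_b.get(var, val) == val for var, val in tail_a.items())

def inconsistent_pairs(rules):
    """Return pairs of same-head, compatible rules whose probabilities differ."""
    clashes = []
    for i, (head_a, tail_a, p_a) in enumerate(rules):
        for head_b, tail_b, p_b in rules[i + 1:]:
            if head_a == head_b and compatible(tail_a, tail_b) and p_a != p_b:
                clashes.append(((head_a, tail_a), (head_b, tail_b)))
    return clashes

# Each rule is (head, tail, probability); tails map variables to values.
original = [
    (("X", True), {"Y": False}, 0.40),   # (2-10)
    (("X", True), {"Z": True}, 0.80),    # (2-11)
    (("Y", False), {"X": True}, 0.70),   # (2-12)
    (("Y", False), {"Z": True}, 0.45),   # (2-13)
    (("Z", True), {}, 0.75),             # (2-14)
]
repaired = [
    (("X", True), {"Y": False}, 0.40),   # (2-10a)
    (("X", True), {"Z": True}, 0.40),    # (2-11a)
    (("Y", False), {"X": True}, 0.70),   # (2-12a)
    (("Y", False), {"Z": True}, 0.70),   # (2-13a)
    (("Z", True), {}, 0.75),             # (2-14a)
]

before = inconsistent_pairs(original)
after = inconsistent_pairs(repaired)
```

With the original values the check reports the two offending pairs, (2-10)/(2-11) and (2-12)/(2-13); with the equalized values it reports none.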

While  this  equality  requirement  clearly  forbids  inconsistencies,  it  does  little  either 
to  explain  or  to  assist  the  knowledge  engineer  in  his  or  her  efforts  to  build  the  system. 
The engineer’s problem, then, is to determine the equating formula which will be used
initially to create a viable knowledge base. Such formulae are countless, e.g., minima,
maxima, weighted or unweighted averages, any real function over the values, etc.
Moreover, in the absence of other information, all can be equally valid. This makes an
algorithm to construct Bayesian Forests all the more elusive.


¹¹Note that there is insufficient data here to compute any other joint probability.

¹²We define probabilities to be incompatible or mutually exclusive only if there exists a random variable in the
tail (see footnote 2) of both which takes on a different value in each.

P(Y = false | Z = true) = .45
P(Y = false | Z = false) = .83

For example, these probability equations are incompatible since both place conditions upon random variable Y’s
being false and the variable Z assumes a different value in their tails. Similarly conditioned probabilities with
non-identical heads are considered mutually exclusive.

¹³In this case we have arbitrarily set the value of the second member of each pair equal to the first.

Before  we  can  develop  a  knowledge  acquisition  methodology,  we  must  be  aware 
of  those  areas  of  a  Bayesian  Forest  with  the  potential  to  violate  its  construction 
constraints  or  to  harbor  inconsistent  bits  of  knowledge.  Once  identified,  we  can  ensure 
both  the  validity  and  consistency  of  a  forest  by  induction  as  it  is  being  built.  However, 
unlike inductive systems predicated on the ID3 algorithm, Bayesian Forests have no
requirement  for  the  complete  specifications  of  all  attributes  and  values  which  make  those 
systems  less  tenable  for  large  data  sets. 

Following  are  descriptions  of  the  eight  construction  constraints  we  determined,  as 
well  as  the  manner  in  which  our  implementation  satisfies  them.  Taken  together,  they 
guarantee  both  the  structure  of  the  Bayesian  Forest  and,  more  importantly,  its  probabilistic 
validity.  The  fact  that  there  are  only  eight  underscores  the  flexibility  of  the  Bayesian 
Forest  representation  and  its  ability  to  obey  the  laws  of  probability  theory  while  still  being 
general  enough  directly  to  interface  with  the  expert.  Appendix  A  is  an  object-oriented 
object  model  of  a  Bayesian  Forest  using  Z  notation  [26],  [66],  [46].  The  model  gives  a 
formal  mathematical  specification  of  each  constraint. 

o  Constraint 1: This first constraint recognizes the fact that a
Bayesian Forest stores all its probabilistic information in its support nodes
rather than its instantiation nodes. Thus, we restate here the requirement that
all instantiation nodes in a forest be the head of at least one support node.

o  Constraint 2: Because we implement instantiation nodes and
support nodes as separate but interrelated object classes, this constraint
ensures that the instantiations referenced by any support node, i.e., the
sources and sinks of its arrows in the graph, be well-defined.

o  Constraint 3: Here we simply guarantee that all instantiation nodes
taken together form a set, not a family [71], i.e., contain no duplicate
instantiations. Different instantiations of the same item must have distinct values.

o  Constraint 4: This fourth constraint encapsulates Santos’ original
requirement that any support nodes which share a head instantiation must be
mutually  exclusive  [49].  Given  any  state  of  the  world,  all  but  one  of  an 
instantiation  node’s  support  conditions  must  conflict  with  that  particular 
assignment  of  global  values.  In  the  parlance  of  logic,  we  could  say  constraint 
4  requires  the  truth  or  falsity  of  any  instantiation  node  to  be  established  via 
an  exclusive-or  condition  among  its  attendant  support  nodes. 


P(X = true | Y = true) = y1                  (2-15)
P(X = true | Y = true, Z = false) = y2       (2-16)
P(X = true | Y = false, Z = false) = y3      (2-17)
P(X = true | Y = false, Z = true) = y4       (2-18)

For example, the equations above show a set of support conditions using
boolean random variables X, Y and Z. Clearly, each of these support
conditions modifies the same head instantiation: setting X to "true."
However, probabilities (2-15) and (2-16) are not mutually exclusive, since the
former, which does not depend upon random variable Z, will always be valid
any time that the latter is. In this case, when Y is true, either probability
affords a valid inference path to substantiate X’s truth; thus y1 must equal y2
for the database to be consistent.

Constraint  1  guarantees  there  will  be  one  or  more  support  nodes  for  each 
instantiation.  This  fourth  constraint  provides  the  necessary  distinctions 
between  those  support  nodes  such  that  one,  and  only  one,  may  be  active. 
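
The mutual exclusivity test behind Constraint 4 can be sketched in Python as follows.
This is an illustration under assumed names (`mutually_exclusive`, `overlapping_pairs`)
and an assumed dictionary encoding of each support condition's tail; it is not MACK's
actual implementation.

```python
def mutually_exclusive(tail_a, tail_b):
    """Two tails conflict if some shared variable takes a different value in each."""
    return any(var in tail_b and tail_b[var] != val for var, val in tail_a.items())

def overlapping_pairs(tails):
    """Return index pairs of tails (for one head) that are NOT mutually exclusive."""
    return [(i, j)
            for i in range(len(tails))
            for j in range(i + 1, len(tails))
            if not mutually_exclusive(tails[i], tails[j])]

# Tails of equations (2-15) through (2-18), all supporting X = true.
tails = [
    {"Y": True},
    {"Y": True, "Z": False},
    {"Y": False, "Z": False},
    {"Y": False, "Z": True},
]
print(overlapping_pairs(tails))  # [(0, 1)]
```

Only the pair (2-15), (2-16) is reported, matching the discussion of Constraint 4.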

o  Constraint 5: The fifth constraint closes the door on logical cycles.
Given a particular inference chain, it prevents the reoccurrence of a support
node’s head in the tails of its successors in that chain. For example,
equations (2-19) through (2-22) below form a loop in that all four can be
simultaneously active, i.e., no mutual exclusivities exist within the set, and
equation (2-22) depends in part upon the head instantiation of a predecessor,
in this case "A = a1."

P(A = a1 | D = d2)              (2-19)
P(D = d2 | B = b1, C = c1)      (2-20)
P(B = b1 | E = e2)              (2-21)
P(E = e2 | A = a1, C = c1)      (2-22)

Failure to preclude this cycle would allow the inference engine potentially
to enter an infinite loop since equation (2-19) can clearly be re-investigated
as a successor to (2-22).

As  with  many  search  problems,  discovering  these  cycles  can  quickly 
become  combinatorial.  However,  we  can  conduct  this  search  as  the  expert 
identifies  support  conditions  which  ensures  the  consistency  of  the  knowledge 
base.  This  also  assists  the  expert  in  correcting  inconsistencies  by  flagging 
them  sooner,  i.e.,  upon  introduction  to  the  database,  rather  than  later.  We 
accomplish  this  search  cheaply  and  efficiently  using  a  depth-first  algorithm 
which  begins  at  the  head  of  the  new  support  node  and  branches  throughout 
the  Bayesian  Forest  structure,  as  necessary. 
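
A depth-first search of this kind might look like the Python sketch below. The
representation (heads as (variable, value) tuples, tails as sets of such tuples) and the
function name `creates_cycle` are assumptions made for illustration. The sketch also
simplifies matters by ignoring mutual exclusivities along the chain, which the full
constraint takes into account.

```python
def creates_cycle(supports, new_head, new_tail):
    """Depth-first search: does adding (new_head <- new_tail) close a loop?

    supports: list of (head, tail) pairs; head is a (variable, value) tuple
    and tail is a set of such tuples.
    """
    stack, seen = list(new_tail), set()
    while stack:
        inst = stack.pop()
        if inst == new_head:
            return True          # the new head reappears among its own ancestors
        if inst in seen:
            continue
        seen.add(inst)
        for head, tail in supports:
            if head == inst:     # follow every support node for this instantiation
                stack.extend(tail)
    return False

existing = [
    (("D", "d2"), {("B", "b1"), ("C", "c1")}),  # (2-20)
    (("B", "b1"), {("E", "e2")}),               # (2-21)
    (("E", "e2"), {("A", "a1"), ("C", "c1")}),  # (2-22)
]
print(creates_cycle(existing, ("A", "a1"), {("D", "d2")}))  # True: (2-19) closes the loop
print(creates_cycle(existing, ("A", "a1"), {("C", "c1")}))  # False
```

Because the check runs once per new support node against an already consistent forest,
each search only has to explore what is reachable from that node's tail.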

o  Constraint 6: Because we are concerned only with one particular
state of the world at a time, it is obviously unacceptable to have one
instantiation of an item depend upon a different instantiation of that same
item since both can never be true simultaneously. In fact, the probability
shown in equation (2-23) and all others like it would always have to equal
zero, perforce.

P(Y = true | Y = false, Z = true)      (2-23)

Moreover, it is even less acceptable for the veracity of an instantiation to be
dependent upon the instantiation itself as in equation (2-24).

P(Y = true | X = false, Y = true)      (2-24)


o  Constraint 7: Following logic similar to that of Constraint 6, one item
cannot be conditioned by multiple values of another precisely because the
instantiations of the second item obviously contradict themselves. Thus,
equation (2-25) is clearly invalid since the random variable X cannot both be
true and false at the same time.

P(Y = true | X = false, Z = true, X = true)      (2-25)
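
Constraints 6 and 7 are simple enough to state as two small predicates. The Python
sketch below is hypothetical: the function names and the encoding of a tail as a list of
(variable, value) pairs are assumptions chosen so that repeated variables can be
represented at all.

```python
def violates_constraint_6(head, tail):
    """The head's own variable may not appear anywhere in the tail."""
    head_var, _ = head
    return any(var == head_var for var, _ in tail)

def violates_constraint_7(tail):
    """No variable may appear in the tail with two different values."""
    seen = {}
    for var, value in tail:
        if seen.setdefault(var, value) != value:
            return True
    return False

eq_2_23 = (("Y", True), [("Y", False), ("Z", True)])
eq_2_24 = (("Y", True), [("X", False), ("Y", True)])
eq_2_25 = (("Y", True), [("X", False), ("Z", True), ("X", True)])

print(violates_constraint_6(*eq_2_23), violates_constraint_6(*eq_2_24))  # True True
print(violates_constraint_7(eq_2_25[1]))                                  # True
```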


o  Constraint 8: Because we are using a probabilistic reasoning
scheme, we need this last constraint to disallow any simultaneously valid sets
of probabilities for the same item from ever summing to values greater than
1. We use the fact that the head instantiations of the sets’ elements share the
same random variable to identify each set.

In the example below, we have collected all support conditions for
random variable A (regardless of instantiated value) which depend upon
random variable B’s first value.

P(A = ... | B = b1, C = c1) = v              (2-26)
P(A = ... | B = b1, C = c2, D = d1) = w      (2-27)
P(A = ... | B = b1) = x                      (2-28)
P(A = a3 | B = b1, D = d1) = y               (2-29)
P(A = a3 | B = b1, D = d2) = z               (2-30)
Equations (2-26) and (2-27) can never be active at the same time since they
depend on different instantiations of C. Similarly, equation (2-30) is
mutually exclusive both with equation (2-27) and equation (2-29) due to
random variable D. Under constraint 8 these dependencies on different states
of the world divide this set of equations such that the following inequalities
must all be true:

x + z ≤ 1          (2-31)
v + x + z ≤ 1      (2-32)
w + x + y ≤ 1      (2-33)
v + x + y ≤ 1      (2-34)

MACK automatically normalizes the values of any probabilities which
violate this constraint simply by dividing each element of the set by the
total.¹⁴ Notice that equation (2-28)’s probability, x, is a term in all these
subsets since it does not depend either on C or D and can thus be
simultaneously true with any of the other four. We also note that in this example
equation (2-32) effectively overrides (2-31) because if the former holds, then
the latter must also, a fortiori.¹⁵
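
The grouping step of Constraint 8 can be sketched in Python by enumerating the groups
of support conditions whose tails are pairwise compatible and keeping only the maximal
ones. The names `compatible` and `simultaneous_sets`, and the brute-force enumeration,
are assumptions for illustration; they are practical only for small examples such as
(2-26) through (2-30), whose tails are keyed here by their probability symbols.

```python
from itertools import combinations

tails = {
    "v": {"B": "b1", "C": "c1"},              # (2-26)
    "w": {"B": "b1", "C": "c2", "D": "d1"},   # (2-27)
    "x": {"B": "b1"},                         # (2-28)
    "y": {"B": "b1", "D": "d1"},              # (2-29)
    "z": {"B": "b1", "D": "d2"},              # (2-30)
}

def compatible(a, b):
    """Tails are compatible when they agree on every variable they share."""
    return all(b.get(var, val) == val for var, val in a.items())

def simultaneous_sets(tails):
    """All maximal groups of probabilities that can be active together."""
    names = sorted(tails)
    groups = [set(c) for r in range(1, len(names) + 1)
              for c in combinations(names, r)
              if all(compatible(tails[p], tails[q]) for p, q in combinations(c, 2))]
    return [g for g in groups if not any(g < other for other in groups)]

print(sorted("".join(sorted(g)) for g in simultaneous_sets(tails)))
# ['vxy', 'vxz', 'wxy']
```

Non-maximal groups such as {x, z} are dropped because their sums are bounded, a
fortiori, by the sum over any larger group that contains them.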


A complete consistency check essentially involves verification of each of the eight
aforementioned constraints. Obviously, such an approach may become computationally
expensive, especially as the size of the network grows. However, we have implemented
our Bayesian Forest knowledge acquisition routine such that complete consistency
checking is rarely necessary. Specifically, guarantors for constraints 2, 3, 6 and 7 are
built into the object creation routines. Thus, as the expert defines the items of the forest,
their associated values and dependencies, the software objects themselves prohibit
duplicate instantiations and ensure the validity of all references between the instantiation node
and support node classes. Constraints 1, 4, 5 and 8, then, are the only constraints
explicitly tested during a review. In addition, our Bayesian Forest implementation
conducts such reviews incrementally. We check each new support node as it is entered
into the database to ensure it introduces no new inconsistencies to the existing consistent
forest, e.g., by engendering a logical cycle.


¹⁴The pre-normalized value is maintained in the system for use in subsequent normalizations over different sets
of probabilities.

¹⁵v, w, x, y, z are all non-negative real-valued variables between 0 and 1.
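
To illustrate how object creation routines can act as guarantors, the Python sketch
below shows a toy container that refuses duplicate instantiations (Constraint 3) and
undefined references (Constraint 2) at the moment of creation. The class `Forest` and
its methods are hypothetical and are not taken from MACK's source.

```python
class Forest:
    """Toy container whose creation routines enforce Constraints 2 and 3."""

    def __init__(self):
        self.instantiations = set()   # (variable, value) pairs
        self.supports = []            # (head, tail, probability) triples

    def add_instantiation(self, variable, value):
        inst = (variable, value)
        if inst in self.instantiations:
            raise ValueError(f"duplicate instantiation {inst}")      # Constraint 3
        self.instantiations.add(inst)
        return inst

    def add_support(self, head, tail, probability):
        for inst in (head, *tail):
            if inst not in self.instantiations:
                raise ValueError(f"undefined instantiation {inst}")  # Constraint 2
        self.supports.append((head, frozenset(tail), probability))

forest = Forest()
x_true = forest.add_instantiation("X", True)
y_false = forest.add_instantiation("Y", False)
forest.add_support(x_true, [y_false], 0.40)
```

Because invalid objects can never be created, a later review has no need to re-test
these two constraints.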

We  now  turn  our  attention  to  an  actual  application  of  MACK.  The  application  comes 
from  the  National  Aeronautics  and  Space  Administration  (NASA).  The  next  chapter 
discusses  NASA’s  Post-Test  Diagnostic  System  for  the  Space  Shuttle’s  main  engines  and 
how  the  MACK  knowledge  acquisition  tool  performed  in  that  domain. 
3.  THE  POST-TEST  DIAGNOSTIC  SYSTEM


Development,  maintenance  and  improvement  of  any  large  system,  especially  one 
with  human  lives  at  stake,  usually  involves  extensive  testing.  NASA’s  Space 
Transportation  System,  or  Space  Shuttle,  is  no  exception.  Marshall  Space  Flight  Center 
in  Huntsville,  Alabama  routinely  conducts  ground  tests  and  collects  actual  flight  data  on 
the  shuttle’s  boosters  better  to  assess  the  health,  status  and  current  capabilities  of  the 
reusable  engines  and  their  many  components.  Presently,  these  assessments  involve  large 
teams  of  engineers  who  review  remote  data  received  from  hundreds  of  on-board  sensors 
called  PIDs.  Officials  then  use  these  manual  reviews  to  determine  the  fitness  of  the 
engine  for  another  test  or  flight  [7]. 

The  Post-Test  Diagnostic  System  is  an  on-going  cooperative  project  to  automate  the 
Space  Shuttle  Main  Engine  (SSME)  review  process  using  intelligent  systems.  Its  stated 
goals  are  [73]: 

o  to aid in the detection and diagnosis of engine anomalies
o  to increase accuracy and repeatability of rocket engine data analysis
o  to reduce analysis time

When  complete,  its  components  will  validate  engine  sensors,  reliably  extract  salient 
features  from  telemetry  data,  and  analyze  SSME  performance  systems,  combustion 
devices,  turbomachinery  and  dynamics. 
As of this writing, two different versions of one component, the High Pressure
Oxidizer Turbopump (HPOTP), have been built and validated by government contractors
under the auspices of researchers at NASA Lewis Research Center in Cleveland, Ohio.¹⁶
These systems provided us the opportunity to test MACK’s applicability to a real-world
domain and a set of known parameters against which to corroborate the utility of
MACK-acquired knowledge for Bayesian Forest reasoning.

The  HPOTP  is  an  engine  component  designed  initially  to  raise,  then  to  maintain  the 
pressure  of  the  liquid  oxygen  flowing  into  the  engine  at  the  varying  levels  of  thrust  during 
the shuttle’s flight profile [64]. Using a turbine powered by the oxidizer preburner’s
hydrogen-rich hot gas, this centrifugal pump manages the flow of liquid oxygen into the
engine’s main and preburner injectors. Besides the pumps and turbines, the HPOTP’s third
major  group  of  subcomponents  contains  the  extensive  shaft  seals  which  separate  pumps, 
turbines  and  the  fluids  they  regulate  [64]. 

Being  an  automated  tool,  MACK  is  designed  to  be  operated  directly  by  the  domain 
expert.  In  fact,  it  is  a  key  component  of  the  PESKI  Knowledge  Organization  and 
Validation  subsystem  which  considers  a  human  knowledge  engineer  optional  [53].  As  a 
result, we, the knowledge engineers, simulated the expert’s involvement.¹⁷ We note,
however, that much of the previous knowledge engineering accomplished for HPOTP has
involved collating and sorting the information gathered in numerous interviews with the
Alabamian crew of rocket scientists.¹⁸ Appendix B correlates selected text from
knowledge acquisition interviews with anomaly definitions from the second version of
HPOTP and with conditional probabilities in the HPOTP Bayesian Forest. These
correlations show that it is, in fact, plausible partially or completely to remove the middleman
and allow the expert to be his/her own knowledge engineer, i.e., if the expert is so
inclined, s/he can with minimal instruction create a Bayesian Forest from scratch.

Appendix C contains transcripts of a knowledge acquisition session used to create
part of a Bayesian Forest for HPOTP. Session data were taken from interview transcripts
and reflect some prior knowledge engineering. We also presume to reason over the
salient information gathered from the raw data by the existing feature extractor [43].


¹⁶The first was developed by Science Applications International Corporation, San Diego, California [43], the
second by personnel from Aerojet Propulsion Division, Sacramento, California [7].

¹⁷Future work with MACK will involve its direct use by experts to develop a new expert module.

¹⁸NASA Lewis researchers have been conducting interviews with Marshall Space Flight Center engineers since
Spring 1992.
4.  KNOWLEDGE  PROCESSING  WITH  MACK


Having  described  both  Bayesian  Forests  and  the  HPOTP  application,  we  now 
explore  some  of  the  processes  by  which  MACK  acquires  knowledge.  The  PESKI 
architecture  assumes  the  knowledge  engineer  to  be  optional  (see  Figure  1.2)  [53].  As  a 
result,  the  MACK  tool,  like  those  discussed  by  Sandahl  [47],  is  intended  to  be  the  primary 
interface  with  the  expert.  This  role  places  a  premium  on  user-friendliness  as  much  as 
adherence  to  Bayesian  Forest  constructs  and  probabilistic  formalisms. 

MACK  is  a  menu-driven  system.  These  menus  allow  us  to  handle  the  simpler 
Bayesian  Forest  constraints — constraints  #2,  3,  6  &  7  (see  Chapter  2) — by  simple 
manipulation  of  the  menu  options  presented  to  the  user.  Other  illegal  choices  simply  trap 
program  control  until  a  valid  selection  is  entered.  The  examples  excerpted  below  are 
taken from the HPOTP application.¹⁹ They include data entry of the conditions governing
the shift anomaly noted via sensor PID 990, here called "Anomaly 990 Shift." This
anomaly depends upon the sensor’s peak and equilibrium values, which are represented
by the random variables "PID 990 Peak" and "PID 990 Equilibrium," respectively. Here
the expert is creating the first support condition for the instantiation of the random
variable "Anomaly 990 Shift" to value "Found."²⁰


¹⁹See Appendix C for the entire transcript.

²⁰Anomaly variables are basically boolean: "Found" or "Not Found."

At present, Anomaly 990 Shift’s being Found depends upon the following
sets of conditions:


No  support  conditions! 

Enter  0  to  add  new  support  conditions  for 
Anomaly  990  Shift’s  being  Found 

Otherwise,  enter  1  to  quit 

0 

Recall  that  Constraint  2  requires  the  instantiations  and  supports  connected  in  a 
Bayesian  Forest  to  be  well-defined.  Thus  the  system  when  creating  a  support  condition 
only  presents  a  choice  among  the  previously  instantiated  random  variables.  MACK 
additionally  restricts  the  options  to  those  variables  which  can  actually  be  used  in  the 
nascent  support  condition.  We  can  see  this  in  the  absence  of  Anomaly  990  Shift  itself 
from  the  subsequent  menu  shown  below  which  is  in  keeping  with  Constraint  6. 

Anomaly  990  Shift’s  being  Found 
can  depend  upon  which  of  the  following  components: 

2 - PID 990 Equilibrium

3 - PID 990 Peak

0 - None of the Above Components

Choice: 2

1 - Nominal

2 - Out of Family

0 - None of the Above; Abort

Choice: 2

Having  already  selected  a  value  of  PID  990  Equilibrium,  the  expert  is  now  queried 
for  continuance.  In  this  abbreviated  example  we  see  that  the  only  remaining  random 
variable  option  available  to  the  expert  is  PID  990  Peak.  Anomaly  990  Shift  has  been 
previously  removed  under  Constraint  6  and  PID  990  Equilibrium,  as  a  new  addition  to
the  condition,  is  now  illegal  in  accordance  with  Constraint  7. 
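
The menu restriction described here amounts to filtering the list of known random
variables before it is displayed. The Python sketch below is a guess at the logic, with
the hypothetical function `eligible_variables`; it uses the variable names from the
transcript.

```python
def eligible_variables(all_variables, head_variable, tail_so_far):
    """Variables the expert may still add to a support condition's tail."""
    return [v for v in all_variables
            if v != head_variable          # Constraint 6: the head's own variable
            and v not in tail_so_far]      # Constraint 7: one value per variable

variables = ["Anomaly 990 Shift", "PID 990 Equilibrium", "PID 990 Peak"]

print(eligible_variables(variables, "Anomaly 990 Shift", {}))
# ['PID 990 Equilibrium', 'PID 990 Peak']

print(eligible_variables(variables, "Anomaly 990 Shift",
                         {"PID 990 Equilibrium": "Out of Family"}))
# ['PID 990 Peak']
```

Since only legal choices ever reach the screen, the expert cannot construct a condition
that violates either constraint.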


Presently,  this  condition  holds  that  Anomaly  990  Shift’s  being  Found  can 
depend  upon  the  following: 

PID  990  Equilibrium  =  Out  of  Family 

Do  you  wish  to  extend  this  condition?  Y  /  N  y

Anomaly  990  Shift’s  being  Found 

can  depend  upon  which  of  the  following  components: 

3  -  PID  990  Peak 

Choice:  3 


1  --  Nominal 

2  --  Out  of  Family  -  High 

3  --  Out  of  Family  --  Low 

0  --  None  of  the  Above;  Abort 

Choice:  2 

Presently,  this  condition  holds  that  Anomaly  990  Shift’s  being  Found  can 
depend  upon  the  following: 

PID  990  Equilibrium  =  Out  of  Family 
PID  990  Peak  =  Out  of  Family  -  High 

Do  you  wish  to  extend  this  condition?  Y  /  N  n 

Please  complete  the  sentence  below  from  the  following  list  of  choices: 

0  --  inconceivable 

1  --  not  likely 

2  --  possible 

3  -  probable 

4  --  almost  certain 

It  is _ that  the  Anomaly  990  Shift  is  Found  depending  upon 

PID  990  Equilibrium  =  Out  of  Family 
PID  990  Peak  =  Out  of  Family  -  High 

Choice:  3 
MACK then presents the expert with a menu of choices from which it will internally
derive the support condition’s probability. Since a Bayesian Forest’s probabilistic nature
is masked from the expert, we use only the qualitative and linguistic terms shown below
with their current value ranges.²¹

inconceivable:     0.00 - 0.10
not likely:        0.10 - 0.35
possible:          0.35 - 0.65
probable:          0.65 - 0.90
almost certain:    0.90 - 1.00

It is important to note here that during knowledge acquisition for a Bayesian Forest,
the actual numeric value assigned to any given probability is not significant.
Refinement of these values through belief revision and belief updating is the province of
the forest’s reasoning and explanation facilities.²² The values associated with each node
only attain meaning after the inference engine reasons over them during belief updating.
It should be obvious, however, that the inference engine’s propagation of probabilities must
begin somewhere. In his discussion of the validity of such values to probabilistic
reasoning schemes, Pearl [41] writes:

[p.  148,  The]  conditional  probabilities  characterizing  the  links  in  the  network  do 
not  seem  to  impose  definitive  constraints  on  the  probabilities  that  can  be  assigned 
to  the  nodes.  .  .  .  The  result  is  that  any  arbitrary  assignment  of  beliefs  to  the 
propositions a & b can be consistent with the value of P(a|b) that was initially
assigned to the link connecting them ....


²¹Other research in this area has shown the commonality of adjective connotations. Current work seeks to refine
these further [23], [16], [34], [2], [3], [15], [24], [65].

²²Future research efforts will focus on implementations of Bayesian Forest inference mechanisms [53], [4], [56],
[52], [51], [54].

Thus, the decision to use a random number generator in the initial stages of database
development  neither  adds  to  nor  detracts  from  the  Bayesian  Forest.  More  germane  to  the 
topic  at  hand,  it  certainly  has  no  impact  upon  the  consistency  of  the  data’s  logical 
organization  within  that  forest. 
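
A minimal sketch of how a linguistic term might be turned into an initial numeric
value is given below in Python. The value ranges are those listed earlier in this
chapter; the function name `initial_probability` and the choice of a uniform draw
within the range are assumptions, since the text says only that a random number
generator is used.

```python
import random

# Linguistic terms and their current value ranges.
RANGES = {
    "inconceivable":  (0.00, 0.10),
    "not likely":     (0.10, 0.35),
    "possible":       (0.35, 0.65),
    "probable":       (0.65, 0.90),
    "almost certain": (0.90, 1.00),
}

def initial_probability(term, rng=random):
    """Draw a starting value from the range of the chosen term (uniform: assumed)."""
    low, high = RANGES[term]
    return rng.uniform(low, high)

value = initial_probability("probable", random.Random(0))
print(0.65 <= value <= 0.90)  # True
```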

These  menu  restrictions  only  account  for  the  simple  constraints.  The  more  involved 
Bayesian Forest formalisms are found in a separate consistency checking routine. MACK
initiates this larger routine itself after any change to the set of support conditions, removal
of  an  instantiation  or  upon  receipt  of  up  to  five  new,  unsupported  instantiations. 


Welcome  to  M.A.C.K.  --  the  Bayesian  Forest  Module  for  the 

Acquisition  of  Consistent  Knowledge!! 

0  -  Generate  new  Bayesian  Forest 

1  -  Edit  existing  Bayesian  Forest 

2  -  Display  current  Bayesian  Forest 

3  -  Load  Bayesian  Forest  from  file 

4  -  Save  Bayesian  Forest  to  file 

5  -  Check  Forest  Consistency 

6  -  Run  Bayesian  Forest  Belief  Revision  Program 

7  -  Delete  the  current  Bayesian  Forest 

8  -  Exit  Bayesian  Forest  program 


This  consistency  checking  routine  sampled  below  covers  the  four  remaining 
Bayesian  Forest  constraints  and,  as  a  courtesy,  also  notifies  the  user  of  any  conditions 
with  zero  probability.  Initially,  we  see  below  that  the  forest  has  failed  Constraint  1  since 
the  system  has  no  condition  defining  a  probability  value  for  the  absence  of  Anomaly  990 
Shift. In these cases, MACK prompts the expert appropriately. Were the expert to answer
any of these negatively, the consistency routine would abort there and return the expert to the
edit menu.

This Bayesian Forest is currently inconsistent,
is it correct that

Anomaly 990 Shift being Not Found
does not depend on anything else? Y/N y

Please complete the sentence below from the following list of choices:

0 -- inconceivable

1 -- not likely

2 -- possible

3 -- probable

4 -- almost certain

It is _ that the Anomaly 990 Shift is Not Found depending

upon . . .

Nothing!

Choice: 3


This Bayesian Forest is currently inconsistent,
is it correct that

PID 990 Equilibrium being Nominal
does not depend on anything else? Y/N n

Please  edit  the  conditions  for 

PID  990  Equilibrium  being  Nominal  accordingly. 

Growing  trees  for  your  Bayesian  Forest. 

Instantiations: 

0  -  Add  new  instantiation 

1  -  Delete  instantiation 

Support  Conditions: 

2  -  Add  new  support  condition 

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition 

5  -  Return  to  main  menu 


Constraint  4  is  an  important  one  which  identifies  support  conditions  that  are  not 
mutually  exclusive.  With  insufficient  information  to  make  any  automatic  resolution 
assumptions  here,  MACK  again  queries  the  expert. 
ERROR: Support conditions below are not mutually exclusive.

At present, Anomaly 990 Shift’s being Found depends upon the following
sets of conditions:

Support Node #1:

PID 990 Equilibrium = Out of Family
PID 990 Peak = Out of Family - High
Support Node #2:

PID 990 Peak = Out of Family - Low
PID 990 Equilibrium = Nominal
Support Node #3:

PID 990 Equilibrium = Nominal

This Bayesian Forest is currently inconsistent.

The following pair of conditions for Anomaly 990 Shift being Found are not
mutually exclusive.

First Set:

PID 990 Peak = Out of Family - Low
PID  990  Equilibrium  =  Nominal 
Second  Set: 

PID  990  Equilibrium  =  Nominal 

Does  Anomaly  990  Shift’s  being  Found 

really  depend  upon  both  sets  of  conditions?  [Enter  0] 
or 

upon  each  set  separately?  [Enter  1] 

Choice:  1 

Which  of  these  conditions  may  we  add  to  eliminate  the  overlap? 

1  --  PID  990  Peak  can  be  Nominal 

2  --  PID  990  Peak  can  be  Out  of  Family  --  High 
0  --  None  of  the  Above 

Choice:  2 


The expert is given the option either of merging the two conditions into one or of
distinguishing them in some way. While the first option is straightforward,²³ the second
could conceivably draw upon any component in the forest except the one in question, i.e.,
the head of the two support conditions. To assist the expert in this area, MACK makes an
initial simplifying assumption that excludes all random variables that are not already
present. The only way to establish mutual exclusion is for both of the support
conditions to contain in their tails a different instantiation of one or more variables (see
Chapter 2). The basis of the assumption is that at least one of the current random
variables can be expanded to meet this requirement, thus allowing the tool automatically
to select and present options as it does in other areas. These options will be all the values
of the existing variables which are not already represented. MACK can easily determine
which of the two support nodes should obtain the adjustment since, of course, Constraint
7, which proscribes multiple values, remains in effect.


²³This merger cannot be illegal, i.e., violate either Constraint 6 or 7. If the support conditions in question
reference instantiations which will conflict when merged, then they are mutually exclusive and, therefore, not inconsistent.
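
The option list offered to the expert for separating two overlapping conditions can be
sketched as follows in Python. The function `separation_options`, the `domains` table
and the dictionary encoding of tails are assumptions for illustration. Only variables
present in exactly one of the two tails are considered, since adding a second value of
a variable to a tail that already holds it would violate Constraint 7.

```python
def separation_options(tail_a, tail_b, domains):
    """(variable, value) pairs that would make the two tails mutually exclusive.

    Each option is meant for whichever tail currently lacks that variable.
    """
    options = []
    for var in sorted(set(tail_a) ^ set(tail_b)):    # present in exactly one tail
        holder = tail_a if var in tail_a else tail_b
        for value in domains[var]:
            if value != holder[var]:                 # must differ from the existing value
                options.append((var, value))
    return options

domains = {
    "PID 990 Equilibrium": ["Nominal", "Out of Family"],
    "PID 990 Peak": ["Nominal", "Out of Family - High", "Out of Family - Low"],
}
first_set = {"PID 990 Peak": "Out of Family - Low", "PID 990 Equilibrium": "Nominal"}
second_set = {"PID 990 Equilibrium": "Nominal"}

print(separation_options(first_set, second_set, domains))
# [('PID 990 Peak', 'Nominal'), ('PID 990 Peak', 'Out of Family - High')]
```

The two options produced correspond to the two choices shown in the transcript.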

Verifications of Constraint 8, shown below, occur somewhat innocuously. Since
the expert is not aware of the actual probabilistic values anyway, MACK can simply
normalize the pertinent sums²⁴ and report any adjustments of the support conditions’
qualitative variables (e.g., inconceivable, not likely, possible, probable, or almost
certain) to the expert. These normalizations always use the original probabilistic range
the expert assigned in order to prevent a new, high-value addition from overwhelming
predecessors whose values may have already been reduced.


²⁴It is worth noting that although the normalization itself may be a trivial operation, the determination of the
support node sets which are eligible to be normalized is not. Constraint 8’s multi-tiered mathematical representation
in Appendix A is a testament thereto.

This Bayesian Forest is inconsistent.

Currently, support ranges overlap. Adjusting ranges for consistency . . .

Conditions were:

It is probable that the Anomaly 990 Shift @ TJ is Found depending upon

PID 990 Equilibrium @ TJ = Out of Family

PID 990 Peak @ TJ = Out of Family -- High

It is probable that the Anomaly 990 Shift @ TJ is Not Found depending
upon  . . . 

Nothing! 

New  conditions  are: 

It  is  possible  that  the  Anomaly  990  Shift  @  TJ  is  Found  depending  upon 

PID  990  Equilibrium  @  TJ  =  Out  of  Family 

PID  990  Peak  @  TJ  =  Out  of  Family  -  High 

It  is  possible  that  the  Anomaly  990  Shift  @  TJ  is  Not  Found  depending 
upon  . . . 

Nothing! 
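
The adjustment reported in this transcript can be imitated with a short Python sketch:
normalize a set of simultaneously valid values whose sum exceeds 1, then map each
result back to its linguistic term. The numeric values 0.80 and 0.70 are made up for
illustration (both fall in the "probable" range), and the names `term_for` and
`normalize_and_relabel` are assumptions rather than MACK routines.

```python
from bisect import bisect_left

UPPER_BOUNDS = [0.10, 0.35, 0.65, 0.90, 1.00]
TERMS = ["inconceivable", "not likely", "possible", "probable", "almost certain"]

def term_for(p):
    """Map a probability in [0, 1] to its term (boundary values go to the lower term)."""
    return TERMS[bisect_left(UPPER_BOUNDS, p)]

def normalize_and_relabel(values):
    """Divide by the total if the set sums to more than 1, then relabel each value."""
    total = sum(values.values())
    scaled = {k: v / total for k, v in values.items()} if total > 1 else dict(values)
    return {k: (term_for(v), v) for k, v in scaled.items()}

before = {"Found": 0.80, "Not Found": 0.70}   # hypothetical values, both "probable"
after = normalize_and_relabel(before)
print({k: term for k, (term, _) in after.items()})
# {'Found': 'possible', 'Not Found': 'possible'}
```

The input dictionary is left untouched, in keeping with the practice of retaining the
pre-normalized values for later normalizations.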

The  HPOTP  application  turned  out  to  be  a  rather  flat  forest.  By  that  we  mean  that 
the  sensor  readings  which  represent  the  bulk  of  the  random  variables  are  unconditioned, 
and  most  anomaly  determinations  depend  directly  on  the  sensors  rather  than  on  some 
intermediate  calculations.  As  a  result,  the  application  did  not  violate  Constraint  5  which 
searches  for  logical  cycles  in  the  knowledge  base. 
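
A check of the kind Constraint 5 performs can be sketched as a depth-first search over the dependencies between components. This is an illustrative sketch, not MACK's routine: the `find_cycle` function and the dictionary encoding (each head mapped to the items its support conditions depend on) are assumptions made for the example.

```python
def find_cycle(depends_on):
    """Return one dependency cycle as a list of node names, or None.

    `depends_on` maps each head to the tail items that appear in its
    support conditions (a hypothetical encoding of the forest).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for parent in depends_on.get(node, ()):
            state = color.get(parent, WHITE)
            if state == GREY:  # back edge: parent is already on the path
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                cycle = visit(parent)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(depends_on):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# A flat forest: the anomaly depends directly on unconditioned sensors.
flat = {"Anomaly 990 Shift": ["PID 990 Equilibrium", "PID 990 Peak"]}
print(find_cycle(flat))  # None

# A hypothetical cyclic knowledge base: A depends on B, B on C, C on A.
cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
print(find_cycle(cyclic))  # ['A', 'B', 'C', 'A']
```

In a flat forest every search path ends after one step at an unconditioned sensor, so no back edge can occur.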

Acquiring  Temporal  Information 

Many  real-world  domains  require  the  capability  to  model  knowledge  that  changes 
over  time.  This  requirement  is  even  more  pronounced  in  the  PTDS  domain  which  NASA 
eventually  hopes  to  operate  in  real-time  [5].  For  a  knowledge  acquisition  tool  such  as 
MACK,  this  introduces  new  developmental  difficulties.  While  we  want  the  interface  with 
the  expert  to  remain  responsive  and  user-friendly,  we  must  of  course  maintain  the 
formalities of the system's probabilistic requirements.

Temporal aspects of HPOTP are best illustrated by the fact that the sensor readings
are  anything  but  static.  During  any  given  test,  a  sensor  may  take  on  multiple  values,  e.g., 
Nominal,  Erratic,  Spiked,  etc.  Clearly,  this  violates  the  Bayesian  Forest’s  requirement 
that  variable  instantiations  be  unique  and  mutually  exclusive.  However,  by  partitioning 
the  test  period  into  time  slices  we  can  accommodate  these  changes. 

For purposes of MACK's knowledge acquisition, the tool queries the expert for the
number of changes in a particular temporal variable to expect at run-time. It then uses
the largest of these when creating the Bayesian Forest.^15 Appendix D demonstrates our
method of parsing these changes into the timeline. Presently, MACK arbitrarily limits a
support node's relative time dependencies to the current time period and either adjacent
period, i.e., the one immediately before or after. However, expansion of this range
to ± n intervals, if necessary, is very straightforward.
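
One way to picture the time-slice partitioning is to expand each temporal component into one random variable per slice, so that every copy again takes exactly one value. The following is a minimal sketch under assumed names: `expand_components`, the `@ T<i>` labels, and the static variable in the usage example are illustrative and are not MACK's identifiers. It applies the rule stated above of sizing the timeline by the largest number of changes the expert reports, and it further assumes one slice per expected change.

```python
def expand_components(expected_changes, static_components=()):
    """Expand temporal components into one variable per time slice.

    `expected_changes` maps each temporal component to the number of
    changes the expert expects at run-time. The timeline length is the
    largest of these counts (assumption: one slice per expected change).
    """
    slices = max(expected_changes.values(), default=0)
    variables = list(static_components)  # non-temporal variables stay single
    for name in expected_changes:
        variables.extend(f"{name} @ T{i}" for i in range(1, slices + 1))
    return slices, variables


slices, variables = expand_components(
    {"PID 990 Peak": 5, "PID 990 Equilibrium": 3},
    static_components=["Pump Serial Number"],  # hypothetical static variable
)
print(slices)          # 5
print(len(variables))  # 11: one static variable plus 2 components x 5 slices
```

The count grows with the product of temporal components and slices, which is why a small bound on the number of intervals matters for a modest reasoner.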

The MACK interface's approach to the time dependencies within a domain follows
protocols similar to those outlined for the constraints above. The distinctions reside
in the way the tool queries the expert for the temporal dependency of each new random
variable and in the special menu options which accommodate those variables so identified.

Please enter the name of the new component:

Item Name: PID 990 Peak

PID 990 Peak is a new component.

New INODE created: PID 990 Peak = No value given!!

REMINDER:

Bayesian Forest Variables are persistent and mutually exclusive.
In other words, they take on 1, and only 1, of their possible values.
Obviously, this will not accommodate variables that change over
time.

Can the value of PID 990 Peak vary with time?

Y/N y

How many times might PID 990 Peak change? 5

^15 We have limited ourselves to 5 or fewer time intervals since the prototype reasoner used in support of this
research is not powerful enough to handle the numbers of temporal random variables generated.

^16 In keeping with the system's intended flexibility, these qualitative temporal choices are "naturally linguistic"
based on Allen's results [55], [1].


When  these  temporal  components  are  encountered  during  forest  construction,  MACK 
again  queries  the  expert  for  the  appropriate  relative  time  dependencies  of  the  support 
condition.  We  have  now  included  temporal  conditions  into  our  previous  example  shown 
below.  The  "<  UNDEFINED  >"  linguistic  variable  is  a  placeholder  pending  the  expert’s 
assignment  of  a  legitimate  probabilistic  range  after  entering  all  the  tail  values. 


At present, Anomaly 990 Shift's being Found depends upon the following
sets of conditions:

Support  Node  #1: 

PID  990  Equilibrium  =  Out  of  Family 
PID  990  Peak  =  Out  of  Family  -  High 

Enter  0  to  add  new  support  conditions  for 
Anomaly  990  Shift’s  being  Found 
Otherwise,  enter  1  to  quit 
0 

Anomaly  990  Shift’s  being  Found 

can  depend  upon  which  of  the  following  components: 

2 -- PID 990 Equilibrium

3 -- PID 990 Peak

0 -- None of the Above Components

Choice: 3

1 -- Nominal

2 -- Out of Family -- High

3 -- Out of Family -- Low

0  --  None  of  the  Above;  Abort 

Choice:  3 

It is < UNDEFINED > that the Anomaly 990 Shift @ TJ is Found
depending upon . . .

PID 990 Peak @ TJ = Out of Family -- Low

New addition, PID 990 Peak, is time-dependent.

We should read its value, Out of Family -- Low, from which time interval?

-1 - Time period immediately preceding Anomaly 990 Shift = Found

0 - The same time period as Anomaly 990 Shift = Found

1 - Time period immediately after Anomaly 990 Shift = Found

0 

Presently, this condition holds that Anomaly 990 Shift's being Found can
depend upon the following:

PID 990 Peak = Out of Family - Low

Do you wish to extend this condition? Y / N y

Anomaly  990  Shift’s  being  Found 

can  depend  upon  which  of  the  following  components: 

2  -  PID  990  Equilibrium 
Choice:  2 

1 - Nominal

2  -  Out  of  Family 

0  -  None  of  the  Above;  Abort 


Choice:  1 

It  is  <  UNDEFINED  >  that  the  Anomaly  990  Shift  @  TJ  is  Found 
depending  upon  . . . 

PID 990 Peak @ TJ = Out of Family - Low
PID  990  Equilibrium  @  TJ  =  Nominal 


New addition, PID 990 Equilibrium, is time-dependent.

We should read its value, Nominal, from which time interval?

-1 - Time period immediately preceding Anomaly 990 Shift = Found

0 - The same time period as Anomaly 990 Shift = Found

1 - Time period immediately after Anomaly 990 Shift = Found

-1

Presently, this condition holds that Anomaly 990 Shift's being Found can
depend upon the following:

PID 990 Peak @ TJ = Out of Family - Low
PID 990 Equilibrium @ TJ-1 = Nominal

Do  you  wish  to  extend  this  condition?  Y  /  N  n 


Please complete the sentence below from the following list of choices:

0 - inconceivable

1 - not likely

2 - possible

3 - probable

4 - almost certain

It is ____ that the Anomaly 990 Shift @ TJ is Found depending
upon . . .

PID 990 Peak @ TJ = Out of Family - Low
PID 990 Equilibrium @ TJ-1 = Nominal
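
The relative offsets chosen in this dialogue (-1, 0 or 1) can be turned into concrete slice indices once the slice of the head is known. The sketch below is an illustration under stated assumptions: the tuple layout, the `instantiate` helper, and the decision to drop a condition whose tail would fall outside the timeline are choices made for the example, not documented MACK behaviour.

```python
def instantiate(tail, head_slice, n_slices):
    """Resolve (component, value, offset) triples against a head at `head_slice`.

    Returns the concrete tail as (component, slice, value) triples, or None
    when any offset would point before slice 1 or after slice `n_slices`.
    """
    concrete = []
    for component, value, offset in tail:
        if offset not in (-1, 0, 1):  # the range MACK currently allows
            raise ValueError(f"unsupported offset {offset}")
        t = head_slice + offset
        if not 1 <= t <= n_slices:
            return None
        concrete.append((component, t, value))
    return concrete


# Tail of the support condition built in the dialogue above.
tail = [
    ("PID 990 Peak", "Out of Family - Low", 0),
    ("PID 990 Equilibrium", "Nominal", -1),
]
print(instantiate(tail, head_slice=3, n_slices=5))
# [('PID 990 Peak', 3, 'Out of Family - Low'), ('PID 990 Equilibrium', 2, 'Nominal')]
print(instantiate(tail, head_slice=1, n_slices=5))  # None: no slice before T1
```

Widening the permitted offsets to ± n would only change the membership test on `offset`.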


Reasoning  with  Defaults 

Default values for a Bayesian Forest's random variables become important in cases
where the system has incomplete information in the knowledge base with which to reason.
These can include instances where the expert excludes pertinent data points, intentionally
or otherwise, or instances wherein his/her implicit assumptions about the domain are not
explicitly entered into the program.^17

^17 The former type of incomplete information represents necessary knowledge which the expert neglected to
incorporate, while the latter is information which by virtue of expertise s/he considers self-evident. In either case,
if the missing data causes the inference engine to assume a value, the system will highlight the deficiency to the
expert.

When completed, the PESKI architecture will handle these situations via the
Explanation and Interpretation Facility (see Figure 1.2), which will present the system's
output to the human expert to ensure the correctness both of the inference engine's results
and of the logical choices made to arrive at those results. If the value of a random variable
is assumed in order to reach a solution, that variable must be flagged to the user. In the
absence of essential information, these explanations will very likely contain assumptions
of which the expert may approve or disapprove. In either case they will highlight the
system's lack of information, prompting the expert for its inclusion.

However, PESKI is still under development, as evidenced by our own work here on
its knowledge acquisition facility. In the interim we have developed a prototype inference
engine on which to test and validate MACK's acquired knowledge. In addition, this
reasoner exercises the tool's ability to manipulate default data in an incomplete
knowledge base while maintaining the required constraints of probability theory. We
accomplish this by modelling incomplete information about a random variable with
probabilistic sums that are strictly less than 1.^18 There are two very distinct approaches,
then, by which to account for the deficit [40]: we can define default values as supersets
of the given support conditions or as the negation of those same rules.

The superset approach absorbs the probability values of existing rules, if any, for
the default instantiation into a larger set which includes them and all the undefined
probability values, i.e., that fraction which brings the overall sum up to 1. In this case,
we assume the default to be the most common occurrence and the rules to be identified
exceptions thereto. "Default as negation," instead of adjusting the support conditions
entered by the expert, simply creates an assumption node with a probability value which
completely or partially complements the others, i.e., its addition to the set maintains a
sum less than or equal to 1.^19 Unlike the superset default, this assumption node can be
activated if and only if all other support conditions fail. Now the default, i.e., the
assumption node, has become the exception and the given rules constitute the common
cases.

^18 As we have seen in Chapter 2, the sum of probabilities for compatible instantiations must not exceed 1.
However, Bayesian Forests have no requirement for the support conditions' sums always to equal 1 either.

For an example let's return to the simple traffic light. Suppose each of the signal's
three states, red (the default state), yellow and green, has a probability of 0.3 with
the remaining 0.1 unassigned. Superset default logic would absorb the uncounted 0.1 into
the default's value such that red's value increases to 0.4. However, it masks the original,
defined condition for the light's being red, which may contain important information about
the "red" state. The traffic light is simply assumed to be red unless the conditions for
yellow or green are satisfied.

Negation, on the other hand, does not change any of the known data. Instead, it
places part of the 0.1 value into an Assumption node. Then, if and only if no existing
support condition for the traffic signal can be satisfied and the signal's value is pertinent
to the solution being generated will the inference engine assume a value for the light.
Notice that this assumed value need not be red. It will instead be the value that best
supports the current solution, red or otherwise.

^19 The alternative to this assumption node is to specify completely all the possible instantiations that have not yet
been enumerated and then assign explicit values to each. Needless to say, this solution paves the way for a
combinatorial explosion in the required number of such instantiations, many of which do not serve to enhance the
knowledge base since we can assume the expert would have included them if they did.
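
The arithmetic of the traffic-light example can be checked with a short Python sketch. The function names, the `<assumption>` label, and the choice to give the assumption node one tenth of the deficit are illustrative assumptions; the text only states that "part of" the 0.1 is placed in the assumption node. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Expert-assigned probabilities from the traffic-light example.
rules = {
    "red": Fraction(3, 10),
    "yellow": Fraction(3, 10),
    "green": Fraction(3, 10),
}
DEFAULT = "red"


def superset_default(rules, default):
    """Absorb all unassigned probability mass into the default state."""
    deficit = 1 - sum(rules.values())
    adjusted = dict(rules)
    adjusted[default] = adjusted[default] + deficit
    return adjusted


def negation_default(rules, share=Fraction(1, 10)):
    """Leave the rules untouched and add a separate assumption node.

    `share` is the fraction of the deficit given to the assumption node;
    the value 1/10 is an arbitrary choice for this illustration.
    """
    deficit = 1 - sum(rules.values())
    extended = dict(rules)
    extended["<assumption>"] = deficit * share
    return extended


print(superset_default(rules, DEFAULT)["red"])     # 2/5, i.e. 0.4
print(negation_default(rules)["<assumption>"])     # 1/100
print(sum(negation_default(rules).values()) <= 1)  # True
```

The superset version overwrites red's original 0.3, whereas the negation version leaves all three expert values exactly as entered.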

In a probabilistic system the primary concern for default reasoning is the continued
adherence to the laws of probability. Clearly, the superset approach does this; however,
in the process it can also give too much credence to its default value, and possibly
remove seminal knowledge established by the expert. The resultant system runs the risk
of choosing the default value more often than is warranted solely because of its artificially
higher probability. Likewise, "default as negation" also maintains probabilistic validity,
but without the artificial inflation of values, as the assumption node is kept separate from
all others. Its value may increase, but only with the express approval of the expert.
Nevertheless, this latter approach can also become preferential to its previous assumptions
in those cases where the defined probabilities all have small values.

PESKI is designed to take the negation approach; thus MACK (and its proto-reasoner)
supports this position. Initially, we defend against overreliance upon the default by
guaranteeing the probability of the assumption node to be one or more orders of
magnitude lower than any legitimate assignment.^20 However, during feedback the expert
may give her/his permission to incorporate an assumption as an actual default support
condition in its own right. In this case, we increase its value according to the probability
range the expert assigns and continue to reason normally.^21 Constraint 8 (see Chapter
2) will then ensure the probabilities' sum is less than 1. Also, since the head of this
default support condition can duplicate an existing instantiation but will not have a tail
of its own, we must also bypass Constraint 4's requirement for uniqueness. Specifically,
a default support node is, by definition, mutually exclusive with the set of all others; thus
it must, of course, be incompatible with any individual member of that set.

^20 The prototype inference engine operates using integer linear programming and a variant of the simplex
algorithm [72], [30]. In this model the assumption node always has the extreme value, M ("Big M") [52]. As a
result, any calculable state of the world which does not use this value will always produce a better solution.
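
The effect of the "Big M" value can be illustrated with a toy cost comparison. This is only a sketch of the idea, not the prototype's integer linear program: the `BIG_M` constant, the use of -log(p) as the cost of a probability, and the two candidate solutions are all assumptions introduced for the example.

```python
import math

BIG_M = 1e6  # assumed penalty; any value far above every legitimate cost works


def cost(solution):
    """Sum the costs of the support conditions a candidate solution uses.

    Each entry is a probability, or None for an assumption node. Using
    -log(p) as the cost of a probability is an assumption of this sketch.
    """
    return sum(BIG_M if p is None else -math.log(p) for p in solution)


def best(candidates):
    """Pick the name of the cheapest candidate solution."""
    return min(candidates, key=lambda name: cost(candidates[name]))


candidates = {
    "explained by rules": [0.5, 0.3],    # two low but legitimate values
    "needs an assumption": [0.9, None],  # one strong rule plus an assumption
}
print(best(candidates))  # explained by rules
```

Because the penalty dwarfs every legitimate cost, a solution that relies on an assumption is selected only when no assumption-free solution exists.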

In  diagnostic  domains  such  as  HPOTP,  we  are  further  protected  from  overreliance 
on  a  default  since  the  sensor  readings  are  given  for  any  test.  Deduction  on  these  readings 
will  either  support  the  possibility  of  an  anomaly  as  defined  by  the  experts  [7],  [48]  or  not. 
Assumptions  of  default  values  are  much  more  common  in  abductive  systems  [50]  which 
would  receive  as  input  the  presence  of  an  anomaly  and  then  would  attempt  to  posit  the 
most  likely  set  of  evidence,  in  this  case  sensor  readings,  which  might  have  caused  that 
anomaly  to  occur. 

^21 Although termed the exception, the probability value of the default support condition could, in fact, be quite
high relative to its sibling nodes. To the inference engine there is no distinguishing characteristic of a default
support node. It is a value like any other available for computation.

^22 See Appendix D for a graphical example of these values from the sensor data.

We choose here an example of this default dilemma from the PTDS domain.
Sensors #327 and #328 measure pressure levels on the turbopump's balance piston. They
can report any of the following conditions: Nominal, Level Shift ±, or Spike ±.^22 The
difference between these two sensors' readings is itself meaningful: "Delta Level Shift
327-328." This difference can be "Positive," "Negative" or "Zero" where a non-zero
value indicates a change in the sensors' relative values and "Zero" represents no such
change. Input data from the experts include the following four support conditions
governing "Delta Level Shift 327-328" [7], [43]:

It  is  probable  that  Delta  Level  Shift  327-328  is  Positive  depending  upon 

PID  327  =  Level  Shift  + 

PID  328  =  Level  Shift  - 

It  is  probable  that  Delta  Level  Shift  327-328  is  Negative  depending  upon 

PID  327  =  Level  Shift  - 
PID  328  =  Level  Shift  + 

It  is  probable  that  Delta  Level  Shift  327-328  is  Zero  depending  upon  .  .  . 

PID  327  =  Level  Shift  - 
PID  328  =  Level  Shift  - 

It  is  probable  that  Delta  Level  Shift  327-328  is  Zero  depending  upon  . . . 

PID  327  =  Level  Shift  + 

PID  328  =  Level  Shift  + 

This random variable highlights the difficulties of reasoning with defaults as there
exist no explicit support conditions to define Delta Level Shift's value under any other
circumstances, e.g., those when either sensor is spiking up or down or both are. It is
precisely this situation which occurs as the reasoner attempts to determine the likelihood
of "Anomaly 5.06.1".^23 A "Spike" value in either sensor and "Zero" for Delta Level
Shift are the prerequisites for this anomaly, but the value of Delta Level Shift cannot be
inferred in this example unless both sensors register a level shift.

^23 "Spike seen in sensor 327 or 328 only, with no change in steady state pressures or pressure difference indicates
no real rotor motion and possible anomaly in omni seal or sensor itself." [7]

A "default as superset" assumption that "Delta Level Shift" has value "Zero" breaks
the logjam, but also runs afoul of the expert's determination that "Delta Level Shift should
not be zero" [7] except in certain cases. Using the Bayesian Forest's "default as
negation" approach, the inference engine assumes the necessary value to allow its
reasoning to continue.
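
The Delta Level Shift case can be sketched as a rule lookup with a flagged fallback. The four table entries are the support conditions listed above; the `delta_level_shift` function, its `needed` parameter, and the boolean flag are hypothetical names introduced here to illustrate "default as negation", not part of MACK or its prototype reasoner.

```python
# The four expert-supplied support conditions for "Delta Level Shift 327-328",
# keyed by the readings of (PID 327, PID 328).
RULES = {
    ("Level Shift +", "Level Shift -"): "Positive",
    ("Level Shift -", "Level Shift +"): "Negative",
    ("Level Shift -", "Level Shift -"): "Zero",
    ("Level Shift +", "Level Shift +"): "Zero",
}


def delta_level_shift(pid_327, pid_328, needed=None):
    """Return (value, assumed) for Delta Level Shift 327-328.

    If no support condition matches, fall back to the value the current
    line of reasoning needs (`needed`) and flag it as an assumption.
    """
    value = RULES.get((pid_327, pid_328))
    if value is not None:
        return value, False
    if needed is not None:
        return needed, True  # assumption fires only when all rules fail
    return None, False       # value not pertinent, nothing assumed


print(delta_level_shift("Level Shift +", "Level Shift -"))     # ('Positive', False)
print(delta_level_shift("Spike +", "Spike 0", needed="Zero"))  # ('Zero', True)
```

The flag is what allows an explanation facility to report to the expert that "Zero" was assumed rather than derived.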
5. CONCLUSIONS


This research develops a viable knowledge acquisition and maintenance tool and
its associated methodology which together implement the new Bayesian Forest knowledge
model. This new tool, MACK, guarantees the consistency of the data stored in a Bayesian
Forest's knowledge base as it is both acquired and later maintained. Moreover, this tool
has been applied to a real-world domain, NASA's Post-Test Diagnostics System, which
supports Space Shuttle main engine analysis.

MACK, which is implemented on an explicit object-oriented analytical foundation,
contains routines designed to confirm automatically and incrementally the consistency of
the knowledge being received from the expert and to provide him/her with natural assistance
in the transfer of knowledge. Regular incremental checks preserve both probabilistic
validity and logical consistency by flagging inconsistent data points to the expert as
they are entered and presumably under his/her current consideration. Such checking
guards against expert oversight (e.g., the "Whoops! I forgot to run the consistency
checker again!" phenomenon) and helps prevent information overload, since there can be
at most five adjustments required of the expert immediately after any given run of the
consistency checking module.^24

^24 The consistency checking routine runs after each addition, removal or edit of any support node, following the
removal of an instantiation node, and after receipt of the fifth consecutive instantiation.




The tool is able to accept and manipulate time-dependent data, which is both
common and required not only in the PTDS domain modelled herein, but in many other
real-world applications as well. Moreover, this capability will prove crucial to any
eventual efforts to operate a Bayesian Forest inferencing mechanism in real-time or near
real-time. In addition, we have determined the Bayesian Forest's available methods for
dealing with incomplete information, which grant it its flexibility as a knowledge
representation. "Default as negation" is the preferred mechanism as it preserves all of the
rules and data interactions expressly catalogued by the expert, as well as the ability of the
forthcoming Explanation & Interpretation Facility to explain the system's results. This
work breaks ground by integrating aspects of three disparate reasoning schemes
(probabilistic reasoning, temporal reasoning, and reasoning with defaults) into the
Bayesian Forest model, particularly as they touch upon knowledge acquisition.

In order to implement the MACK tool properly to guarantee consistency of the
knowledge, we had to formalize the notion of consistency for Bayesian Forests and then
determine the necessary conditions and constraints. The constraints ensure the proper
relationships between the forest's instantiation and support nodes are in force at all times.
This includes algorithms both to detect constraint violations and to facilitate corrections.^25

^25 An interesting challenge in the enumeration of these constraints lay in the development of the mathematical
specifications from which they were built. For example, Constraint 5's cycles can be of any length > 2 and there
are a combinatorial number of support node permutations for these cycles based on Stirling numbers of the First Kind
[68], [32]. Meanwhile, the groups of compatible support nodes which must all sum to less than 1 in Constraint 8
are also of indeterminate size. Automating this constraint by itself required separate routines to find and define five
distinct sets of three of the different object types that make up a forest for each sum! After that, normalization
was little more than an afterthought.

We have availed ourselves of the Bayesian Forest's computational efficiencies
without sacrificing the increased structural flexibility they afford both for reasoning with
uncertainty and when compared to other Bayesian inferencing methods. We evidence this
by the construction of a Bayesian Forest for the Post-Test Diagnostics System. Continued
work in the application domain by PESKI and Bayesian Forest researchers promises to
facilitate on-going PTDS knowledge acquisition within NASA as well as to provide the
agency with novel, probabilistic reasoning alternatives.

MACK completes PESKI's Knowledge Acquisition and Maintenance module (see
Figure 1.2). With that foundation established by this work, follow-on research will now
concentrate on the development of the other components of the PESKI architecture, namely
the Inference Engine and, just as importantly, the Explanation & Interpretation Facility.
Future directions for work with the MACK tool include its application to a new domain,
one for which there has been no previous knowledge engineering, and development of a
graphical user interface to interact better with the expert. This could include, but is not
limited to, "point-and-click" Bayesian Forest construction and on-screen representations
of the forest's inconsistencies, which should facilitate the expert's comprehension of the
knowledge model.

Areas of particular interest for the inference engine will be its abilities to reason
with defaults, to unify time and uncertainty for diagnosis [55], and to function over large
data sets using techniques such as genetic algorithms [56]. Each of these areas promises
only to augment Bayesian Forests, which are already a powerful knowledge representation.
The feedback module's efforts should focus on its ability to explain to the expert the
assumptions made during inferencing and to incorporate efficiently the expert's revisions
into the knowledge base via MACK or other means. Ultimately, this module must be able
not only to explain results at the expert's level, but also to interpret for the end-user
whose expertise will be minimal.

Together with MACK, these components and the eventual Natural Language
Interface will establish Bayesian Forests as a front-line computing methodology for
reasoning under uncertainty. They will do so by allowing the expert to enter his/her data,
conduct verification test runs, and receive from those tests an explanation detailed
enough to allow the expert to refine and adjust the knowledge base appropriately.

A. BAYESIAN FOREST MATHEMATICAL SPECIFICATION^26


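Component
\[ \mathit{name} : \operatorname{seq}\ \mathit{char} \]

Value
\[ \mathit{name} : \operatorname{seq}\ \mathit{char} \]

All_Instantiations
\[ \mathit{all\_instantiations} : \mathit{Component} \leftrightarrow \mathit{Value} \]

INODE
\[
\begin{array}{l}
\mathit{All\_Instantiations} \\
\mathit{component} : \mathit{Component} \\
\mathit{value} : \mathit{Value} \\
\hline
(\mathit{component}, \mathit{value}) \in \mathit{all\_instantiations}
\end{array}
\]

Spt_List
\[ \mathit{entries} : \mathbb{P}(\mathit{INODE}) \]

All_Supports
\[ \mathit{all\_supports} : \mathit{Spt\_List} \leftrightarrow \mathit{INODE} \]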

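^26 See [26], [66], [46] for a more complete description of the Z notation used in this appendix.

SPTNODE
\[
\begin{array}{l}
\mathit{All\_Supports} \\
\mathit{head} : \mathit{INODE} \\
\mathit{tail} : \mathit{Spt\_List} \\
\mathit{probability} : \mathbb{R} \\
\mathit{probability\_range} : \mathbb{N} \\
\hline
(\mathit{tail}, \mathit{head}) \in \mathit{all\_supports} \\
\mathit{probability} \geq 0.0 \\
\mathit{probability} \leq 1.0 \\
\mathit{probability\_range} \leq 4
\end{array}
\]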


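Bayesian Forest
\[
\begin{array}{l}
\mathit{instantiations} : \mathbb{P}(\mathit{INODE}) \\
\mathit{supports} : \mathbb{P}(\mathit{SPTNODE}) \\
\hline
\forall i : \mathit{instantiations},\ \exists s : \mathit{supports} \bullet i = s.\mathit{head} \\[4pt]
\forall s : \mathit{supports} \bullet (s.\mathit{tail}.\mathit{entries} \cup \{s.\mathit{head}\}) \subseteq \mathit{instantiations} \\[4pt]
\forall i, j : \mathit{instantiations} \bullet ((i \neq j) \land (i.\mathit{component} = j.\mathit{component})) \Rightarrow (i.\mathit{value} \neq j.\mathit{value}) \\[4pt]
\forall s, t : \mathit{supports} \bullet (s.\mathit{head} = t.\mathit{head}) \Rightarrow \\
\quad (\exists i, j : \mathit{instantiations} \bullet ((i.\mathit{component} = j.\mathit{component}) \land (i \neq j) \\
\quad \land\ (i \in s.\mathit{tail}.\mathit{entries}) \land (j \in t.\mathit{tail}.\mathit{entries}))) \\[4pt]
\forall A : \mathbb{P}(\mathit{SPTNODE}),\ \exists Y : \operatorname{seq} A \bullet \\
\quad ((\# Y = \# A) \\
\quad \land\ (\forall s : A,\ \exists (n, s) : Y) \\
\quad \land\ (\exists (n, p), (m, q) : Y \bullet ((m > n) \land (p.\mathit{head} \in q.\mathit{tail}.\mathit{entries})))) \\[4pt]
\forall i, j : \mathit{instantiations},\ s : \mathit{supports} \bullet \\
\quad (\{i, j\} \subseteq (s.\mathit{tail}.\mathit{entries} \cup \{s.\mathit{head}\})) \Rightarrow (i.\mathit{component} \neq j.\mathit{component}) \\[4pt]
\forall s : \mathit{supports} \bullet (\forall z : s.\mathit{tail}.\mathit{entries} \bullet z.\mathit{component} \neq s.\mathit{head}.\mathit{component})
\end{array}
\]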

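\[
\begin{array}{l}
\forall c : \mathit{Component},\ k : \mathit{instantiations} \bullet \\
\quad \exists \Theta : \mathbb{P}(\mathit{SPTNODE});\ \Phi : \mathbb{P}(\mathit{INODE});\ \Gamma : \mathbb{P}(\mathit{Component}) \bullet \\
\qquad (\Theta = \{x : \mathit{supports} \mid ((c = x.\mathit{head}.\mathit{component}) \land ((k \in x.\mathit{tail}.\mathit{entries}) \\
\qquad\quad \lor\ (\forall l : x.\mathit{tail}.\mathit{entries},\ k.\mathit{component} \neq l.\mathit{component})))\}) \\
\qquad \land\ (\Phi = (\bigcup_{y \in \Theta} y.\mathit{tail}.\mathit{entries}) \setminus \{k\}) \\
\qquad \land\ (\Gamma = \{b : \mathit{Component} \mid \#(\{b\} \lhd \Phi) \geq 2\}) \\
\qquad \land\ (\Psi = \{i : \Phi \mid i.\mathit{component} \in \Gamma\}) \\
\qquad \land\ ((\#\Omega = \#\Gamma) \land (\Omega \subseteq \Psi) \land (\forall x : \Gamma,\ \exists y : \Omega \bullet (x = y.\mathit{component}))) \\
\qquad \land\ \displaystyle\sum_{q \in \{z : \Theta \mid \Omega \cap z.\mathit{tail}.\mathit{entries} = \varnothing\}} q.\mathit{probability} \leq 1.0
\end{array}
\]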
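
Three of the simpler predicates of the Bayesian Forest schema lend themselves to a direct check: every instantiation heads some support node, every node named in a support node is a known instantiation, and no support condition names the same component twice. The following Python sketch is illustrative only; the `INode` and `SptNode` tuples and the `violations` function are assumed names that loosely mirror the schema and are not MACK's implementation.

```python
from collections import namedtuple

INode = namedtuple("INode", "component value")
SptNode = namedtuple("SptNode", "head tail probability")


def violations(instantiations, supports):
    """List violations of three of the simpler structural constraints."""
    found = []
    known = set(instantiations)
    heads = {s.head for s in supports}
    for i in instantiations:
        if i not in heads:
            found.append(f"no support condition has head {i}")
    for s in supports:
        members = set(s.tail) | {s.head}
        if not members <= known:
            found.append(f"support for {s.head} uses an unknown instantiation")
        components = [m.component for m in members]
        if len(components) != len(set(components)):
            found.append(f"support for {s.head} repeats a component")
    return found


peak_high = INode("PID 990 Peak", "Out of Family -- High")
shift_found = INode("Anomaly 990 Shift", "Found")
forest = [
    SptNode(peak_high, frozenset(), 0.5),  # unconditioned sensor reading
    SptNode(shift_found, frozenset({peak_high}), 0.7),
]
print(violations([peak_high, shift_found], forest))  # []

incomplete = [SptNode(shift_found, frozenset({peak_high}), 0.7)]
print(len(violations([peak_high, shift_found], incomplete)))  # 1
```

The probabilities 0.5 and 0.7 are placeholder values for the example; the sums that Constraint 8 bounds are not checked here.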


B.  SELECTED  EXCERPTS  OF  KNOWLEDGE  ACQUISITION  INTERVIEWS 

This  appendix  juxtaposes  the  output  text  from  Aerojet  Propulsion  Division’s 
HPOTP  expert  system  and  the  Bayesian  Forest  support  node(s)  developed  in  this  work 
with  the  actual  input  from  the  expert  [7].  For  each  of  the  five  anomalies  highlighted 
herein,  we  identify  first  the  expertise,  followed  by  the  Aerojet  anomaly(-ies)  and  Bayesian 
Forest  support  conditions  in  that  order. 


Anomaly  5.06.1: 

o  Expert  Input  from  knowledge  acquisition  interview  transcripts: 

"Seeing  change  in  one,  [but]  not  the  other  probably  is  due  to  static  seal  in 
housing  or  pressure  shift  not  associated  with  real  rotor  motion.  It  probably 
is  not  a  sensor  problem." 

o Anomaly Report Text:

"Spike seen in sensor <327|328> only, with no change in steady state
pressures  or  pressure  difference.  Possible  sensor  or  omni  seal  anomaly. 
No  real  rotor  motion." 

o  Bayesian  Forest  Support  Conditions: 

•  It  is  probable  that  the  Anomaly  5.06.1  is  Found  depending  upon  .  .  . 

PID  327  =  Spike  - 
PID  328  =  Spike  0 

Delta  Level  Shift  for  PIDs  327  &  328  =  Zero 

•  It  is  probable  that  the  Anomaly  5.06.1  is  Found  depending  upon  .  .  . 

PID  327  =  Spike  + 

PID  328  =  Spike  0 

Delta Level Shift for PIDs 327 & 328 = Zero

• It is probable that the Anomaly 5.06.1 is Found depending upon . . .

PID  328  =  Spike  - 
PID  327  =  Spike  0 

Delta  Level  Shift  for  PIDs  327  &  328  =  Zero 

•  It  is  probable  that  the  Anomaly  5.06.1  is  Found  depending  upon  .  .  . 

PID  328  =  Spike  + 

PID  327  =  Spike  0 

Delta  Level  Shift  for  PIDs  327  &  328  =  Zero 


Anomaly  5.06.2: 

o  Expert  Input  from  knowledge  acquisition  interview  transcripts: 

"Level  change  in  one  and  not  in  the  other  ....  I  don’t  believe  we  can  see 
cup  seal/washer  failures  or  problems.  Changes  were  made  to  the  design 
to  eliminate  this  type  of  problem.  Also  it  is  highly  unlikely  that  a  piece 
(of  seal)  could  migrate  to  a  pressure  opening  an  effect  that  pressure.  I 
would  be  skeptical  that  it  is  a  cup  washer,  more  likely  it  is  an  omni  seal 
or  a  sensor  problem." 

o Anomaly Report Text:

"Level shift seen in sensor <327|328> only. Possible sensor problem,
omni  seal  leakage  problem.  No  real  rotor  motion." 

o  Bayesian  Forest  Support  Conditions: 

•  It  is  possible  that  the  Anomaly  5.06.2  is  Found  depending  upon  .  .  . 

PID  327  =  Level  Shift  - 
PID  328  =  Level  Shift  0 

•  It  is  possible  that  the  Anomaly  5.06.2  is  Found  depending  upon  .  .  . 

PID  327  =  Level  Shift  + 

PID  328  =  Level  Shift  0 

•  It  is  possible  that  the  Anomaly  5.06.2  is  Found  depending  upon  .  .  . 

PID  328  =  Level  Shift  - 
PID  327  =  Level  Shift  0 

•  It  is  possible  that  the  Anomaly  5.06.2  is  Found  depending  upon  .  .  . 

PID  328  =  Level  Shift  + 

PID 327 = Level Shift 0


Anomaly 5.06.4:


o  Expert  Input  from  knowledge  acquisition  interview  transcripts: 

"See  the  delta  is  327-328  so  if  this  one  goes  up  and  this  one  goes  down 
than  this  should  go  up  and  its  possibly  anomalous  rotor  motion  considering 
this  is  constant  power  level." 

o  Anomaly  Report  Text: 

"Possible  HPOTP  anomalous  rotor  motion." 

o  Bayesian  Forest  Support  Conditions: 

•  It  is  possible  that  the  Anomaly  5.06.4  is  Found  depending  upon  .  .  . 

PID  327  =  Level  Shift  - 
PID  328  =  Level  Shift  + 

Delta  Level  Shift  for  PIDs  327  &  328  =  Negative 

•  It  is  possible  that  the  Anomaly  5.06.4  is  Found  depending  upon  .  .  . 

PID  327  =  Level  Shift  + 

PID  328  =  Level  Shift  - 

Delta  Level  Shift  for  PIDs  327  &  328  =  Positive 
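As a rough illustration of how support conditions like the two above might be held as data, the following Python sketch stores each condition as a qualitative likelihood plus a set of required component values, and reports which conditions match a set of observed values. The names `SupportCondition`, `CONDITIONS_5_06_4` and `matching_conditions` are assumptions made for this example; they are not part of MACK or of the Bayesian Forest implementation.

```python
from dataclasses import dataclass

DELTA = "Delta Level Shift for PIDs 327 & 328"


@dataclass(frozen=True)
class SupportCondition:
    """One qualitative rule: `target = value` is `likelihood` given `requires`."""
    target: str
    value: str
    likelihood: str   # linguistic term, e.g. "possible" or "probable"
    requires: tuple   # ((component, required value), ...)

    def matches(self, observed: dict) -> bool:
        # The condition applies only if every required component
        # was observed with exactly the required value.
        return all(observed.get(name) == val for name, val in self.requires)


# The two support conditions listed above for Anomaly 5.06.4.
CONDITIONS_5_06_4 = [
    SupportCondition(
        "Anomaly 5.06.4", "Found", "possible",
        (("PID 327", "Level Shift -"),
         ("PID 328", "Level Shift +"),
         (DELTA, "Negative")),
    ),
    SupportCondition(
        "Anomaly 5.06.4", "Found", "possible",
        (("PID 327", "Level Shift +"),
         ("PID 328", "Level Shift -"),
         (DELTA, "Positive")),
    ),
]


def matching_conditions(conditions, observed):
    """Return the support conditions whose requirements are all met."""
    return [c for c in conditions if c.matches(observed)]


if __name__ == "__main__":
    observed = {
        "PID 327": "Level Shift -",
        "PID 328": "Level Shift +",
        DELTA: "Negative",
    }
    for cond in matching_conditions(CONDITIONS_5_06_4, observed):
        print(f"It is {cond.likelihood} that {cond.target} is {cond.value}")
```

The sketch only performs exact matching on observed values; it does not attempt the belief revision that the Bayesian Forest inference engine carries out.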
Anomalies  5.15.1  &  5.15.2:


o  Expert  Input  from  knowledge  acquisition  interview  transcripts: 

"951  is  one  of  the  [sensors]  that  addresses  the  pressure  in  the  LOX  [liquid 
oxygen]  drain.  1187  is  the  temperature  that  I  showed  you  that  you  can 
pick  whether  it’s  a  new  pump  or  not.  95%  of  the  time  it  runs  at  160  [psia] 
upwards  to  450.  I  think  from  an  erratic  criteria  needs  to  be  able  to 
discriminate.  The  thing  that  bothers  me  here  is  that  951  could  be  erratic 
for  cause  and  it  would  not  necessarily  cause  1187  temperature  measurement
to  be  erratic.  So  should  not  try  to  couple  the  two  measurements.
The  thing  to  do  would  be  to  classify  either  or  both  as  being  erratic  and  go 
from  there.  I  believe  that  I  would  do  951  like  we  did  the  other  pressures. 
Establish  the  data  base  and  you  compare  each  test  against  that  database. 
You  also  analyze  for  erratic  behavior  and  if  it  by  itself  falls  out  then  flag
it.  The  temperature  of  the  erratic  test  is  appropriate.  There  are  two 
characteristics  and  one  or  the  other  is  always  there.  What  you  look  for  is 
something  different." 

o  Anomaly  Report  Text: 

"HPOTP  erratic  primary  pump  seal  drain  pressure  may  indicate  sensor 
problem  or  seal  anomaly.  No  effect  seen  in  drain  temperature." 

o  Bayesian  Forest  Support  Conditions: 

•  It  is  possible  that  the  Anomaly  5.15.1  is  Found  depending  upon  .  .  . 

PID  951/952/953  =  Erratic 
PID  1187  =  Nominal 

•  It  is  possible  that  the  Anomaly  5.15.1  is  Found  depending  upon  .  .  . 

PID  951/952/953  =  Spike 
PID  1187  =  Nominal 

•  It  is  possible  that  the  Anomaly  5.15.2  is  Found  depending  upon  .  .  . 

PID  1187  =  Erratic 

PID  951/952/953  =  Nominal 

•  It  is  possible  that  the  Anomaly  5.15.2  is  Found  depending  upon  .  .  . 

PID  1187  =  Spike 

PID  951/952/953  =  Nominal 
C.  MACK  SESSION  TRANSCRIPT

Welcome  to  M.A.C.K.  --  the  Bayesian  Forest  Module  for  the 

Acquisition  of  Consistent  Knowledge! 

0  -  Generate  new  Bayesian  Forest 

1  -  Edit  existing  Bayesian  Forest 

2  -  Display  current  Bayesian  Forest 

3  -  Load  Bayesian  Forest  from  file 

4  -  Save  Bayesian  Forest  to  file 

5  -  Check  Forest  Consistency 

6  -  Run  Bayesian  Forest  Belief  Revision  Program 

7  -  Delete  the  current  Bayesian  Forest 

8  -  Exit  Bayesian  Forest  program 

Choice:  0 


Growing  trees  for  your  Bayesian  Forest. 

Please  select  from  the  following  menu: 
Initializing  forest. 

Instantiations: 

0  -  Add  new  instantiation 

1  -  Delete  Instantiation 

Support  Conditions: 

2  -  Add  new  support  condition 

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition 

5  -  Return  to  main  menu 

Choice:  0 


Please  pick  a  component  to  instantiate: 

Empty  NameTable. 

0  -  Add  new  component 
-1  --  Abort 
Choice:  0 


Please  enter  the  name  of  the  new  component: 

Item  Name:  PID  990  Peak

PID  990  Peak  is  a  new  component. 
New  INODE  created:  PID  990  Peak  =  No  value  given!!

REMINDER: 

Bayesian  Forest  Variables  are  persistent  and  mutually  exclusive. 

In  other  words,  they  take  on  1,  and  only  1,  of  their  possible  values.

Obviously,  this  will  not  accommodate  variables  that  change  over  time. 

Can  the  value  of  PID  990  Peak  vary  with  time?  Y  /  N  y

How  many  times  might  PID  990  Peak  change?  5 

Would  you  like  to  add  another  component? 


Please  enter  the  name  of  the  new  component: 

Item  Name:  PID  990  Equilibrium 

PID  990  Equilibrium  is  a  new  component. 

New  INODE  created:  PID  990  Equilibrium  =  No  value  given!! 

REMINDER: 

Bayesian  Forest  Variables  are  persistent  and  mutually  exclusive. 

In  other  words,  they  take  on  1,  and  only  1,  of  their  possible  values. 

Obviously,  this  will  not  accommodate  variables  that  change  over  time. 

Can  the  value  of  PID  990  Equilibrium  vary  with  time?  Y  /  N  y

How  many  times  might  PID  990  Equilibrium  change?  3 

Would  you  like  to  add  another  component? 


Please  enter  the  name  of  the  new  component: 

Item  Name:  Anomaly  990  Shift 

Anomaly  990  Shift  is  a  new  component. 

New  INODE  created:  Anomaly  990  Shift  =  No  value  given!! 

REMINDER: 

Bayesian  Forest  Variables  are  persistent  and  mutually  exclusive. 

In  other  words,  they  take  on  1,  and  only  1,  of  their  possible  values. 

Obviously,  this  will  not  accommodate  variables  that  change  over  time. 

Can  the  value  of  Anomaly  990  Shift  vary  with  time?  Y  /  N  y

How  many  times  might  Anomaly  990  Shift  change?  5 

Would  you  like  to  add  another  component? 
n 


Please  pick  a  component  to  instantiate: 
1  --  Anomaly  990  Shift 
2  --  PID  990  Equilibrium

3  -  PID  990  Peak 

0  --  Add  new  component 
-1  --  Abort 
Choice:  3 


Creating  new  instantiation  for  PID  990  Peak 
1  --  PID  990  Peak  =  No  value  given!! 

Do  you  wish  to  enter  a  new  legal  value  for  PID  990  Peak  [Enter  0] 
or  choose  a  pre-existing  value  [Enter  1]? 

0 

Enter  the  new  value  for  PID  990  Peak: 

Value:  Nominal 

New  INODE  created:  PID  990  Peak  =  Nominal 
INODE  PID  990  Peak  =  No  value  given!!  deleted 

Instantiations: 

0  -  Add  new  instantiation 

1  -  Delete  instantiation 

Support  Conditions: 

2  -  Add  new  support  condition 

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition 

5  -  Return  to  main  menu 

Choice:  2 


Please  pick  a  component: 

1  -  Anomaly  990  Shift 

2  -  PID  990  Equilibrium 

3  -  PID  990  Peak
0  -  Abort 

Choice:  3 


Please  pick  an  instantiation  of  PID  990  Peak: 

1  -  PID  990  Peak  =  Nominal 
0  -  Abort 
Choice:  1 

At  present,  the  PID  990  Peak’s  being  Nominal  depends  upon  the  following: 

At  present,  PID  990  Peak’s  being  Nominal  depends  upon  the  following  sets  of  conditions: 

No  support  conditions! 

Enter  0  to  add  new  support  conditions  for 
PID  990  Peak’s  being  Nominal 
otherwise,  enter  1  to  quit
0 

PID  990  Peak’s  being  Nominal 

can  depend  upon  which  of  the  following  components: 

1  -  Anomaly  990  Shift 

2  --  PID  990  Equilibrium 

3  -  PID  990  Peak 

0  --  None  of  the  Above  Components 
Choice:  0 


Please  complete  the  sentence  below  from  the  following  list  of  choices: 

0  --  inconceivable 

1  --  not  likely 

2  --  possible 

3  --  probable 

4  --  almost  certain 

It  is _ that  the  PID  990  Peak  is  Nominal  depending  upon  .  . . 

Nothing! 

Choice:  2 


Presently,  this  condition 

It  is  virtually  impossible  that  the  Anomaly  990  Shift  is  No  value  given!!  depending  upon  . . . 
Nothing! 

is  _less_than_  inconceivable;  it’s  virtually  impossible!! 

If  this  is  inaccurate,  you  may  wish  to  edit  this  support  condition. 

Presently,  this  condition 

It  is  virtually  impossible  that  the  PID  990  Equilibrium  is  No  value  given!!  depending  upon  . . .
Nothing!

is  _less_than_  inconceivable;  it’s  virtually  impossible!!

If  this  is  inaccurate,  you  may  wish  to  edit  this  support  condition. 
Instantiations:

0  -  Add  new  instantiation 

1  -  Delete  instantiation 

Support  Conditions: 

2  -  Add  new  support  condition 

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition 

5  -  Return  to  main  menu 

Choice:  0 


Please  pick  a  component  to  instantiate: 

1  --  Anomaly  990  Shift 

2  --  PID  990  Equilibrium 

3  -  PID  990  Peak 

0  --  Add  new  component 
-1  --  Abort 
Choice:  3 


Creating  new  instantiation  for  PID  990  Peak 
1  --  PID  990  Peak  =  Nominal 

Do  you  wish  to  enter  a  new  legal  value  for  PID  990  Peak  [Enter  0] 
or  choose  a  pre-existing  value  [Enter  1]? 

0 

Enter  the  new  value  for  PID  990  Peak: 

Value:  Out  of  Family  -  High 
New  INODE  created:  PID  990  Peak  =  Out  of  Family  -  High 

Instantiations: 

0  -  Add  new  instantiation 

1  -  Delete  instantiation 

Support  Conditions: 

2  -  Add  new  support  condition 

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition 

5  -  Return  to  main  menu 

Choice:  0 


Please  pick  a  component  to  instantiate: 

1  -  Anomaly  990  Shift 

2  -  PID  990  Equilibrium 

3  -  PID  990  Peak
0  --  Add  new  component
-1  --  Abort 
Choice:  3 


Creating  new  instantiation  for  PID  990  Peak 

1  --  PID  990  Peak  =  Nominal 

2  --  PID  990  Peak  =  Out  of  Family  -  High 

Do  you  wish  to  enter  a  new  legal  value  for  PID  990  Peak  [Enter  0]
or  choose  a  pre-existing  value  [Enter  1]?

0


Enter  the  new  value  for  PID  990  Peak:

Value:  Out  of  Family  --  Low 
New  INODE  created:  PID  990  Peak  =  Out  of  Family  -  Low 

Instantiations: 

0  -  Add  new  instantiation 

1  -  Delete  instantiation

Support  Conditions: 

2  -  Add  new  support  condition 

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition

5  -  Return  to  main  menu 

Choice:  0 

Please  pick  a  component  to  instantiate:

1  -  Anomaly  990  Shift

2  -  PID  990  Equilibrium

3  -  PID  990  Peak

0  -  Add  new  component 
-1  -  Abort 
Choice:  2 


Creating  new  instantiation  for  PID  990  Equilibrium 
1  -  PID  990  Equilibrium  =  No  value  given!!

Do  you  wish  to  enter  a  new  legal  value  for  PID  990  Equilibrium  [Enter  0]
or  choose  a  pre-existing  value  [Enter  1]?

1 


Please  instantiate  PID  990  Equilibrium  from  this  menu:

1  -  Nominal

2  -  Out  of  Family  -  High

3  -  Out  of  Family  -  Low
0  -  Abort
Choice:  1


New  INODE  created:  PID  990  Equilibrium  =  Nominal 
INODE  PID  990  Equilibrium  =  No  value  given!!  deleted 

Instantiations: 

0  -  Add  new  instantiation 

1  -  Delete  instantiation 

Support  Conditions: 

2  -  Add  new  support  condition 

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition 

5  -  Return  to  main  menu 

Choice:  0 


Please  pick  a  component  to  instantiate: 

1  --  Anomaly  990  Shift 

2  --  PID  990  Equilibrium 

3  -  PID  990  Peak 

0  --  Add  new  component 
-1  --  Abort 
Choice:  2 

Creating  new  instantiation  for  PID  990  Equilibrium 
1  -  PID  990  Equilibrium  =  Nominal 

Do  you  wish  to  enter  a  new  legal  value  for  PID  990  Equilibrium  [Enter  0] 
or  choose  a  pre-existing  value  [Enter  1]? 

0 


Enter  the  new  value  for  PID  990  Equilibrium: 

Value:  Out  of  Family 

New  INODE  created:  PID  990  Equilibrium  =  Out  of  Family 

Instantiations: 

0  -  Add  new  instantiation 

1  -  Delete  instantiation 

Support  Conditions: 

2  -  Add  new  support  condition 

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition 

5  -  Return  to  main  menu 

Choice:  0 
Please  pick  a  component  to  instantiate:

1  --  Anomaly  990  Shift 

2  --  PID  990  Equilibrium 

3  -  PID  990  Peak 

0  --  Add  new  component 
-1  --  Abort 
Choice:  1 

Creating  new  instantiation  for  Anomaly  990  Shift 
1  --  Anomaly  990  Shift  =  No  value  given!! 

Do  you  wish  to  enter  a  new  legal  value  for  Anomaly  990  Shift  [Enter  0] 
or  choose  a  pre-existing  value  [Enter  1]? 

0 


Enter  the  new  value  for  Anomaly  990  Shift: 

Value:  Found 

New  INODE  created:  Anomaly  990  Shift  =  Found 
INODE  Anomaly  990  Shift  =  No  value  given!!  deleted 

This  Bayesian  Forest  is  currently  inconsistent. 

Is  it  correct  that 

Anomaly  990  Shift  being  Found 
does  not  depend  on  anything  else?  Y/N  n 

Please  edit  the  conditions  for 

Anomaly  990  Shift  being  Found  accordingly. 
Bayesian  Forest  fails  consistency  checking. 

Instantiations: 

0  -  Add  new  instantiation 

1  -  Delete  instantiation 

Support  Conditions: 

2  -  Add  new  support  condition 

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition 

5  -  Return  to  main  menu 

Choice:  0 


Please  pick  a  component  to  instantiate: 

1  -  Anomaly  990  Shift 

2  -  PID  990  Equilibrium 

3  -  PID  990  Peak

0  -  Add  new  component 
-1  -  Abort 
Choice:  1 
Creating  new  instantiation  for  Anomaly  990  Shift
1  --  Anomaly  990  Shift  =  Found

Do  you  wish  to  enter  a  new  legal  value  for  Anomaly  990  Shift  [Enter  0]
or  choose  a  pre-existing  value  [Enter  1]?

0


Enter  the  new  value  for  Anomaly  990  Shift:

Value:  Not  Found

New  INODE  created:  Anomaly  990  Shift  =  Not  Found 

Instantiations: 

0  -  Add  new  instantiation 

1  -  Delete  instantiation 

Support  Conditions: 

2  -  Add  new  support  condition 

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition 

5  -  Return  to  main  menu 

Choice:  2 

Please  pick  a  component: 

1  -  Anomaly  990  Shift 

2  -  PID  990  Equilibrium 

3  -  PID  990  Peak 
0  -  Abort

Choice:  1 

Please  pick  an  instantiation  of  Anomaly  990  Shift: 

1  -  Anomaly  990  Shift  =  Found 

2  -  Anomaly  990  Shift  =  Not  Found 
0  -  Abort 

Choice:  1 

At  present,  the  Anomaly  990  Shift’s  being  Found  depends  upon  the  following: 

At  present,  Anomaly  990  Shift’s  being  Found  depends  upon  the  following  sets  of  conditions: 

No  support  conditions! 

Enter  0  to  add  new  support  conditions  for 
Anomaly  990  Shift’s  being  Found 
Otherwise,  enter  1  to  quit 
0 


Anomaly  990  Shift’s  being  Found 

can  depend  upon  which  of  the  following  components: 
1  --  Anomaly  990  Shift

2  --  PID  990  Equilibrium

3  -  PID  990  Peak

0  --  None  of  the  Above  Components
Choice:  2


1  --  Nominal

2  --  Out  of  Family 

0  --  None  of  the  Above;  Abort 
Choice:  2 


It  is  <  UNDEFINED  >  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon
PID  990  Equilibrium  @  T_i  =  Out  of  Family

New  addition,  PID  990  Equilibrium,  is  time-dependent.

We  should  read  its  value,  Out  of  Family,  from  which  time  interval?

-1  -  Time  period  immediately  preceding  Anomaly  990  Shift  =  Found
0  -  The  same  time  period  as  Anomaly  990  Shift  =  Found
1  -  Time  period  immediately  after  Anomaly  990  Shift  =  Found


Presently,  this  condition  holds  that  Anomaly  990  Shift’s  being  Found
can  depend  upon  the  following:

PID  990  Equilibrium  =  Out  of  Family

Do  you  wish  to  extend  this  condition?  Y  /  N  y

Anomaly  990  Shift’s  being  Found

can  depend  upon  which  of  the  following  components:

3  -  PID  990  Peak
Choice:  3 


1  -  Nominal 

2  -  Out  of  Family  -  High 

3  -  Out  of  Family  -  Low

0  -  None  of  the  Above;  Abort 
Choice:  2 


It  is  <  UNDEFINED  >  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon
PID  990  Equilibrium  @  T_i  =  Out  of  Family
PID  990  Peak  @  T_i  =  Out  of  Family  -  High

New  addition,  PID  990  Peak,  is  time-dependent.

We  should  read  its  value,  Out  of  Family  -  High,  from  which  time  interval?

-1  -  Time  period  immediately  preceding  Anomaly  990  Shift  =  Found
0  -  The  same  time  period  as  Anomaly  990  Shift  =  Found
1  -  Time  period  immediately  after  Anomaly  990  Shift  =  Found

0
Presently,  this  condition  holds  that  Anomaly  990  Shift’s  being  Found
can  depend  upon  the  following: 

PID  990  Equilibrium  =  Out  of  Family 
PID  990  Peak  =  Out  of  Family  -  High 

Do  you  wish  to  extend  this  condition?  Y  /  N  n 

Please  complete  the  sentence  below  from  the  following  list  of  choices: 

0  --  inconceivable 

1  --  not  likely 

2  --  possible 

3  --  probable 

4  --  almost  certain 

It  is _ that  the  Anomaly  990  Shift  is  Found  depending  upon  . . . 

PID  990  Equilibrium  =  Out  of  Family 
PID  990  Peak  =  Out  of  Family  -  High 
Choice:  3 

This  Bayesian  Forest  is  currently  inconsistent. 

Is  it  correct  that 

Anomaly  990  Shift  being  Not  Found 
does  not  depend  on  anything  else?  Y/N  y

Please  complete  the  sentence  below  from  the  following  list  of  choices: 

0  --  inconceivable 

1  --  not  likely 

2  --  possible 

3  --  probable 

4  --  almost  certain 

It  is _ that  the  Anomaly  990  Shift  is  Not  Found  depending  upon  . . .

Nothing! 

Choice:  3 


This  Bayesian  Forest  is  currently  inconsistent. 

Is  it  correct  that 

PID  990  Equilibrium  being  Nominal 
does  not  depend  on  anything  else?  Y/N  y

Please  complete  the  sentence  below  from  the  following  list  of  choices: 

0  --  inconceivable 

1  --  not  likely 

2  --  possible 

3  --  probable 

4  --  almost  certain 
It  is _ that  the  PID  990  Equilibrium  is  Nominal  depending  upon  . . .

Nothing! 

Choice:  3 

This  Bayesian  Forest  is  currently  inconsistent. 

Is  it  correct  that 

PID  990  Equilibrium  being  Out  of  Family 
does  not  depend  on  anything  else?  Y/N  y

Please  complete  the  sentence  below  from  the  following  list  of  choices: 

0  --  inconceivable 

1  --  not  likely 

2  --  possible 

3  --  probable 

4  --  almost  certain 

It  is _ that  the  PID  990  Equilibrium  is  Out  of  Family  depending  upon  . . .

Nothing! 

Choice:  1 

This  Bayesian  Forest  is  currently  inconsistent. 

Is  it  correct  that 

PID  990  Peak  being  Out  of  Family  --  High 
does  not  depend  on  anything  else?  Y/N  y

Please  complete  the  sentence  below  from  the  following  list  of  choices: 

0  --  inconceivable 

1  --  not  likely 

2  --  possible 

3  --  probable 

4  --  almost  certain 

It  is _ that  the  PID  990  Peak  is  Out  of  Family  -  High  depending  upon  . . .

Nothing! 

Choice:  1 


This  Bayesian  Forest  is  currently  inconsistent. 

Is  it  correct  that 

PID  990  Peak  being  Out  of  Family  --  Low 
does  not  depend  on  anything  else?  Y/N  y

Please  complete  the  sentence  below  from  the  following  list  of  choices: 

0  --  inconceivable 

1  --  not  likely 

2  --  possible 

3  --  probable 

4  --  almost  certain 
It  is _ that  the  PID  990  Peak  is  Out  of  Family  -  Low  depending  upon  . . .

Nothing! 

Choice:  1 


This  Bayesian  Forest  is  inconsistent. 

Currently,  support  ranges  overlap.  Adjusting  ranges  for  consistency  ... 
Conditions  were: 

It  is  probable  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon  . . . 
PID  990  Equilibrium  @  T_i  =  Out  of  Family
PID  990  Peak  @  T_i  =  Out  of  Family  -  High

It  is  probable  that  the  Anomaly  990  Shift  is  Not  Found  depending  upon  . . . 
Nothing! 

New  conditions  are: 

It  is  possible  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon  . . .
PID  990  Equilibrium  @  T_i  =  Out  of  Family
PID  990  Peak  @  T_i  =  Out  of  Family  -  High

It  is  possible  that  the  Anomaly  990  Shift  is  Not  Found  depending  upon  .  .  . 
Nothing! 

This  Bayesian  Forest  is  inconsistent. 

Currently,  support  ranges  overlap.  Adjusting  ranges  for  consistency  . . . 
Conditions  were: 

It  is  probable  that  the  PID  990  Equilibrium  is  Nominal  depending  upon  . . . 
Nothing! 

It  is  not  likely  that  the  PID  990  Equilibrium  is  Out  of  Family  depending  upon  .  . . 
Nothing! 

New  conditions  are: 

It  is  probable  that  the  PID  990  Equilibrium  is  Nominal  depending  upon  . .  . 
Nothing! 

It  is  not  likely  that  the  PID  990  Equilibrium  is  Out  of  Family  depending  upon  . . . 
Nothing! 

This  Bayesian  Forest  is  inconsistent. 

Currently,  support  ranges  overlap.  Adjusting  ranges  for  consistency  . . . 
Conditions  were: 


It  is  possible  that  the  PID  990  Peak  is  Nominal  depending  upon  . . . 

Nothing!

It  is  not  likely  that  the  PID  990  Peak  is  Out  of  Family  -  High  depending  upon  . . .
Nothing!
It  is  not  likely  that  the  PID  990  Peak  is  Out  of  Family  -  Low  depending  upon  . . .
Nothing! 

New  conditions  are: 

It  is  possible  that  the  PID  990  Peak  is  Nominal  depending  upon  . . .

Nothing! 

It  is  not  likely  that  the  PID  990  Peak  is  Out  of  Family  -  High  depending  upon  . . . 
Nothing! 

It  is  not  likely  that  the  PID  990  Peak  is  Out  of  Family  -  Low  depending  upon  . . . 
Nothing! 

Instantiations: 

0  -  Add  new  instantiation 

1  -  Delete  instantiation 

Support  Conditions: 

2  -  Add  new  support  condition 

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition 

5  -  Return  to  main  menu 

Choice:  2 

Please  pick  a  component: 

1  --  Anomaly  990  Shift 

2  --  PID  990  Equilibrium 

3  -  PID  990  Peak 
0  --  Abort 

Choice:  1 


Please  pick  an  instantiation  of  Anomaly  990  Shift: 

1  --  Anomaly  990  Shift  =  Found 

2  --  Anomaly  990  Shift  =  Not  Found 
0  --  Abort 

Choice:  1 


At  present,  the  Anomaly  990  Shift’s  being  Found  depends  upon  the  following: 

At  present,  Anomaly  990  Shift’s  being  Found  depends  upon  the  following  sets  of  conditions: 
Support  Node  #1:

PID  990  Equilibrium  =  Out  of  Family 
PID  990  Peak  =  Out  of  Family  -  High 

Enter  0  to  add  new  support  conditions  for 
Anomaly  990  Shift’s  being  Found 
otherwise,  enter  1  to  quit
0 


Anomaly  990  Shift’s  being  Found 

can  depend  upon  which  of  the  following  components: 

1  --  Anomaly  990  Shift 

2  --  PID  990  Equilibrium 

3  -  PID  990  Peak 

0  --  None  of  the  Above  Components 
Choice:  3 


1  --  Nominal 

2  --  Out  of  Family  --  High 

3  --  Out  of  Family  --  Low 

0  --  None  of  the  Above;  Abort 
Choice:  3 


It  is  <  UNDEFINED  >  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon
PID  990  Peak  @  T_i  =  Out  of  Family  -  Low

New  addition,  PID  990  Peak,  is  time-dependent.

We  should  read  its  value,  Out  of  Family  -  Low,  from  which  time  interval?

-1  -  Time  period  immediately  preceding  Anomaly  990  Shift  =  Found
0  -  The  same  time  period  as  Anomaly  990  Shift  =  Found
1  -  Time  period  immediately  after  Anomaly  990  Shift  =  Found

0 

Presently,  this  condition  holds  that  Anomaly  990  Shift’s  being  Found 
can  depend  upon  the  following: 

PID  990  Peak  =  Out  of  Family  -  Low 

Do  you  wish  to  extend  this  condition?  Y  /  N  v 

Anomaly  990  Shift’s  being  Found 

can  depend  upon  which  of  the  following  components: 

2  -  PID  990  Equilibrium 
Choice:  2 

1  -  Nominal

2  -  Out  of  Family 

0  -  None  of  the  Above;  Abort 
Choice:  1 

It  is  <  UNDEFINED  >  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon
PID  990  Peak  @  T_i  =  Out  of  Family  -  Low
PID  990  Equilibrium  @  T_i  =  Nominal

New  addition,  PID  990  Equilibrium,  is  time-dependent.

We  should  read  its  value,  Nominal,  from  which  time  interval?
-1  --  Time  period  immediately  preceding  Anomaly  990  Shift  =  Found
0  --  The  same  time  period  as  Anomaly  990  Shift  =  Found
1  --  Time  period  immediately  after  Anomaly  990  Shift  =  Found

0 

Presently,  this  condition  holds  that  Anomaly  990  Shift’s  being  Found
can  depend  upon  the  following:

PID  990  Peak  =  Out  of  Family  -  Low
PID  990  Equilibrium  =  Nominal

Do  you  wish  to  extend  this  condition?  Y  /  N  n

Please  complete  the  sentence  below  from  the  following  list  of  choices:

0  --  inconceivable

1  --  not  likely

2  --  possible

3  --  probable

4  --  almost  certain

It  is _ that  the  Anomaly  990  Shift  is  Found  depending  upon  . . .

PID  990  Peak  =  Out  of  Family  -  Low 
PID  990  Equilibrium  =  Nominal 
Choice:  3 


Sum  of  the  probabilities  cannot  exceed  1.0!

This  Bayesian  Forest  is  inconsistent.

Currently,  support  ranges  overlap.  Adjusting  ranges  for  consistency  . . .
Conditions  were:

It  is  probable  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon  . . .
PID  990  Peak  @  T_i  =  Out  of  Family  -  Low
PID  990  Equilibrium  @  T_i  =  Nominal

It  is  possible  that  the  Anomaly  990  Shift  is  Not  Found  depending  upon  . . .
Nothing!

New  conditions  are:

It  is  possible  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon  . . .
PID  990  Peak  @  T_i  =  Out  of  Family  -  Low
PID  990  Equilibrium  @  T_i  =  Nominal

It  is  possible  that  the  Anomaly  990  Shift  is  Not  Found  depending  upon  . . .
Nothing!

Instantiations:

0  -  Add  new  instantiation
1  -  Delete  instantiation

Support  Conditions: 
2  -  Add  new  support  condition

3  -  Edit  existing  support  condition 

4  -  Delete  support  condition 

5  -  Return  to  main  menu 

Choice:  2 


Please  pick  a  component: 

1  --  Anomaly  990  Shift 

2  --  PID  990  Equilibrium 

3  -  PID  990  Peak 
0  --  Abort 

Choice:  1 


Please  pick  an  instantiation  of  Anomaly  990  Shift: 

1  --  Anomaly  990  Shift  =  Found 

2  --  Anomaly  990  Shift  =  Not  Found 
0  --  Abort 

Choice:  1 


At  present,  the  Anomaly  990  Shift’s  being  Found  depends  upon  the  following: 

At  present,  Anomaly  990  Shift’s  being  Found  depends  upon  the  following  sets  of  conditions: 
Support  Node  #1:

PID  990  Equilibrium  =  Out  of  Family 
PID  990  Peak  =  Out  of  Family  -  High 
Support  Node  #2: 

PID  990  Peak  =  Out  of  Family  -  Low 
PID  990  Equilibrium  =  Nominal 

Enter  0  to  add  new  support  conditions  for 
Anomaly  990  Shift’s  being  Found 
Otherwise,  enter  1  to  quit 
0 

Anomaly  990  Shift’s  being  Found 

can  depend  upon  which  of  the  following  components: 

1  --  Anomaly  990  Shift 

2  --  PID  990  Equilibrium 

3  -  PID  990  Peak 

0  --  None  of  the  Above  Components 
Choice:  2 


1  --  Nominal 

2  --  Out  of  Family 

0  --  None  of  the  Above;  Abort 
Choice:  1 
It  is  <  UNDEFINED  >  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon  . . .

PID  990  Equilibrium  @  T_i  =  Nominal

New  addition,  PID  990  Equilibrium,  is  time-dependent.

We  should  read  its  value,  Nominal,  from  which  time  interval?

-1  -  Time  period  immediately  preceding  Anomaly  990  Shift  =  Found 
0  -  The  same  time  period  as  Anomaly  990  Shift  =  Found 
1  -  Time  period  immediately  after  Anomaly  990  Shift  =  Found 

0 

Presently,  this  condition  holds  that  Anomaly  990  Shift’s  being  Found 
can  depend  upon  the  following: 

PID  990  Equilibrium  =  Nominal 

Do  you  wish  to  extend  this  condition?  Y  /  N  n 

Please  complete  the  sentence  below  from  the  following  list  of  choices: 

0  -  inconceivable 

1  -  not  likely 

2  -  possible 

3  -  probable 

4  -  almost  certain 

It  is _ that  the  Anomaly  990  Shift  is  Found  depending  upon  . . . 

PID  990  Equilibrium  =  Nominal 
Choice:  2 

ERROR:  Support  conditions  below  are  not  mutually  exclusive. 

At  present,  Anomaly  990  Shift’s  being  Found  depends  upon  the  following  sets  of  conditions:
Support  Node  #1:

PID  990  Equilibrium  =  Out  of  Family 
PID  990  Peak  =  Out  of  Family  -  High 
Support  Node  #2: 

PID  990  Peak  =  Out  of  Family  -  Low 
PID  990  Equilibrium  =  Nominal 
Support  Node  #3: 

PID  990  Equilibrium  =  Nominal 

This  Bayesian  Forest  is  currently  inconsistent. 

The  following  pair  of  conditions  for  Anomaly  990  Shift  being  Found 
are  not  mutually  exclusive. 

First  Set: 

PID  990  Peak  =  Out  of  Family  -  Low
PID  990  Equilibrium  =  Nominal 
Second  Set: 

PID  990  Equilibrium  =  Nominal 
Does  Anomaly  990  Shift’s  being  Found 
really  depend  upon  _both_  sets  of  conditions?  [Enter  0]
or 

upon  each  set  separately?  [Enter  1] 

Choice:  1 

Which  of  these  conditions  may  we  add  to  eliminate  the  overlap? 

1  --  PID  990  Peak  can  be  Nominal 

2  --  PID  990  Peak  can  be  Out  of  Family  --  High 

3  --  PID  990  Equilibrium  can  be  Out  of  Family

4  --  PID  990  Equilibrium  can  be  Out  of  Family
0  --  None  of  the  Above 

Choice:  2 


It  is  possible  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon  . . .
PID  990  Equilibrium  @  T_i  =  Nominal
PID  990  Peak  @  T_i  =  Out  of  Family  -  High

New  addition,  PID  990  Peak,  is  time-dependent.

We  should  read  its  value,  Out  of  Family  -  High,  from  which  time  interval?
-1  -  Time  period  immediately  preceding  Anomaly  990  Shift  =  Found
0  -  The  same  time  period  as  Anomaly  990  Shift  =  Found 
1  -  Time  period  immediately  after  Anomaly  990  Shift  =  Found 

0 


Sum  of  the  probabilities  cannot  exceed  1.0!

This  Bayesian  Forest  is  inconsistent. 

Currently,  support  ranges  overlap.  Adjusting  ranges  for  consistency  . . . 
Conditions  were: 

It  is  possible  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon  . . .
PID  990  Equilibrium  @  T_i  =  Nominal
PID  990  Peak  @  T_i  =  Out  of  Family  -  High

It  is  possible  that  the  Anomaly  990  Shift  is  Not  Found  depending  upon  . . . 
Nothing! 

New  conditions  are: 

It  is  possible  that  the  Anomaly  990  Shift  @  T_i  is  Found  depending  upon  . . .
PID  990  Equilibrium  @  T_i  =  Nominal
PID  990  Peak  @  T_i  =  Out  of  Family  -  High

It  is  possible  that  the  Anomaly  990  Shift  is  Not  Found  depending  upon  . . . 
Nothing! 

Instantiations: 

0  -  Add  new  instantiation 
1  -  Delete  instantiation 

Support Conditions:

2 - Add new support condition

3  -  Edit  existing  support  condition 

4 - Delete support condition

5  -  Return  to  main  menu 

Choice:  5 

0  -  Generate  new  Bayesian  Forest 

1  -  Edit  existing  Bayesian  Forest 

2  -  Display  current  Bayesian  Forest 

3 - Load Bayesian Forest from file

4 - Save Bayesian Forest to file

5 - Check Forest Consistency

6 - Run Bayesian Forest Belief Revision Program

7 - Delete the current Bayesian Forest

8  -  Exit  Bayesian  Forest  program 

Choice:  4 

Enter filename: output-test
Forest saved to file.

0  -  Generate  new  Bayesian  Forest 

1  -  Edit  existing  Bayesian  Forest 

2 - Display current Bayesian Forest

3 - Load Bayesian Forest from file

4 - Save Bayesian Forest to file

5 - Check Forest Consistency

6 - Run Bayesian Forest Belief Revision Program

7 - Delete the current Bayesian Forest

8  -  Exit  Bayesian  Forest  program 

Choice:  8 


Cutting  down  the  forest. 
Goodbye. 
D. TEMPORAL PARSING


This  appendix  outlines  the  methodology  [7]  by  which  a  test  run  is  parsed  into  the 
time  slices  discussed  in  Chapter  4.  At  present,  MACK  uses  a  similar  method,  but  this 
capability  is  neither  automated  nor  a  part  of  the  tool  itself.  Instead,  we  simply  preprocess 
the  temporal  data  by  hand  before  entering  it. 


Figure D.1 Timeline Parsing Example

In Figure D.1 we have constructed a very simple example. The figure contains
data streams from three hypothetical sensors over a ten-second period divided into
one-second intervals. The algorithm assigns breakpoints in the timeline at the beginning and
end of any event (e.g., spikes, peaks, level shifts) from any of the sensors. The result
is a series of intervals as shown.
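
The breakpoint rule can be sketched in a few lines of Python. This is only an illustration, not part of MACK (which does not automate this step). The event tuples, the function name parse_timeline, and the sample times are all hypothetical. The sketch also leaves out the later step of unifying intervals that belong to a single event.

```python
from typing import List, Tuple

# Hypothetical event record: (sensor name, start time, end time), in seconds.
Event = Tuple[str, float, float]


def parse_timeline(events: List[Event], t_start: float, t_end: float) -> List[Tuple[float, float]]:
    """Split [t_start, t_end] at the beginning and end of every event."""
    breakpoints = {t_start, t_end}
    for _sensor, start, end in events:
        # Clip to the run so an event crossing a boundary still contributes.
        breakpoints.add(max(t_start, min(start, t_end)))
        breakpoints.add(max(t_start, min(end, t_end)))
    ordered = sorted(breakpoints)
    # Consecutive breakpoints delimit the time slices.
    return list(zip(ordered, ordered[1:]))


# Invented sample data for three sensors over a ten-second run.
events = [("PID 1", 1.0, 2.0), ("PID 2", 1.5, 3.0), ("PID 3", 6.0, 8.0)]
slices = parse_timeline(events, 0.0, 10.0)
for number, (begin, end) in enumerate(slices, start=1):
    print(f"period {number}: {begin:4.1f} s to {end:4.1f} s")
```

Collecting the breakpoints in a set means that events from different sensors which begin or end at the same instant produce only one boundary.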

Periods  1,  4,  6,  8  and  11  contain  no  significant  events,  i.e.,  all  sensors  are  nominal. 
Positive  spikes/peaks  occur  in  periods  2  and  3;  negative  spikes  in  7,  9  and  10.  We  see 
level  shifts  during  periods  5,  7  and  12,  and  erratic  readings  from  PID  1  in  the  last 
interval.  Notice  that  period  7,  originally  subdivided  into  three  by  the  spike  in  PID  2,  is 
unified  to  reflect  the  occurrence  of  a  single  level  shift  in  PID  3  rather  than  three  distinct 
shifts  which  would  be  recognizable  by  two  small  plateaux  during  the  decline. 

We  consider  events  which  start  within  one  second  of  each  other  to  be  concurrent. 
Thus,  the  spikes  in  periods  2  and  3  are  concurrent  as  are  those  in  periods  9  and  10.  PID 
3’s level shift in period 7 is, perforce, concurrent with the spike in PID 2. Similarly, the
shift during period 12 coincides with PID 1’s erratic behavior.
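
The concurrency rule can also be sketched in Python. We assume here that "within one second" includes a difference of exactly one second; the source does not say. The event names, start times, and the function name concurrent_pairs are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def concurrent_pairs(starts: Dict[str, float], tolerance: float = 1.0) -> List[Tuple[str, str]]:
    """Return every pair of events whose start times differ by at most `tolerance` seconds."""
    pairs = []
    for (name_a, t_a), (name_b, t_b) in combinations(sorted(starts.items()), 2):
        # Assumption: the one-second window is inclusive.
        if abs(t_a - t_b) <= tolerance:
            pairs.append((name_a, name_b))
    return pairs


# Invented start times, in seconds from the beginning of the run.
starts = {"spike A": 1.0, "spike B": 1.6, "shift C": 6.0}
pairs = concurrent_pairs(starts)
print(pairs)
```

The relation is checked pairwise because it is not transitive: two events may each be concurrent with a third without being concurrent with each other.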
E. GLOSSARY OF TECHNICAL TERMS


Unless otherwise specified, the non-acronym terms and definitions contained in this
glossary have been excerpted from either of the following sources: Rosenberg’s Dictionary
of Artificial Intelligence and Robotics [44] or Smith’s The Facts on File Dictionary
of Artificial Intelligence [63].


Abduction:  An  inference  process  that  generates  explanations.  To  a  first  approximation, 
it  uses  the  following  paradigm: 

Given: b
       a → b
Infer: a

Unlike  deduction,  abduction  is  not  a  legal  inference.  However,  despite  the  fact  that 
abduction  can  lead  to  wrong  conclusions,  it  is  useful  and  often  necessary. 

Artificial  Intelligence  (AI):  the  capability  of  a  device  to  perform  functions  that  are 
normally  associated  with  human  intelligence  such  as  reasoning,  learning,  and  self- 
improvement. 

Automated Reasoning: See Expert System.

Bayes’ Rule: A theorem of probability which says that if one knows, for example, how many
patients with a given disease show a particular symptom, as well as the likelihood of the
disease and of the symptom, then one can calculate the conditional probability of the
disease given the symptom.
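
As a small worked example of the rule, the Python sketch below computes P(disease | symptom) = P(symptom | disease) P(disease) / P(symptom). The three input probabilities are invented for illustration and are not drawn from any data set; the function name posterior is likewise hypothetical.

```python
def posterior(p_symptom_given_disease: float, p_disease: float, p_symptom: float) -> float:
    """Bayes' Rule: P(disease | symptom) from the three quantities named above."""
    if not 0.0 < p_symptom <= 1.0:
        raise ValueError("P(symptom) must lie in (0, 1]")
    return p_symptom_given_disease * p_disease / p_symptom


# Invented numbers: 90% of patients with the disease show the symptom,
# 1% of the population has the disease, and 5% shows the symptom.
p = posterior(0.9, 0.01, 0.05)
print(f"P(disease | symptom) is approximately {p:.2f}")
```

With these inputs the posterior is roughly 0.18, so the symptom raises the probability of the disease from 1% to about 18%.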
Bayesian Inference: Also known as statistical inference, one means by which
knowledge-based systems can reason when uncertainty is involved. Given a hypothesis
event H and an evidence event E, we obtain from an expert estimates of the prior
probabilities P(H) and P(E), and the conditional probability P(E|H). From Bayes’ Rule
we obtain the probability of H given evidence E. In practical problems E may be any
subset of all possible evidence events and H may be any subset of the set of all possible
hypotheses. This tends to require a vast number of conditional probabilities to be
calculated. An alternative approach is therefore to combine it with a rule-based system
where conditional probabilities are given only for each rule. This approach is adequate
until one considers the local updating problem that arises when two or more inference
rules make a conclusion about the same hypothesis.

Causal  Analysis:  used  in  credit  (blame)  assignment  to  track  the  probable  causes  of 
observed  events. 

Causal  Model:  a  model  where  the  causal  relations  among  various  actions  and  events 
are  represented  explicitly. 

Certainty:  the  degree  of  confidence  one  has  in  a  fact  or  relationship.  As  used  in  AI, 
contrasts  with  probability,  which  is  the  likelihood  that  an  event  will  occur. 

Certainty  Factor:  a  numerical  weight  given  to  a  fact  or  relationship  indicating  the 
confidence one has in the fact or relationship. These numbers behave differently than
probability  coefficients.  In  general,  methods  for  manipulating  certainty  factors  are  more 
informal  than  approaches  to  combining  probabilities.  Most  rule-based  systems  use 
certainty  factors  rather  than  probabilities.  Synonymous  with  Confidence  Factor. 

Deduction:  In  formal  logic  the  derivation  of  a  logical  consequence  from  a  specific  set 
of  premises;  a  truth-preserving  transformation  of  assertions. 

Default:  A  value  that  is  used  when  no  other  value  is  specified. 

Default  Reasoning  [61]:  Patterns  of  inference  that  permit  the  drawing  of  conclusions 
suggested but not entailed by their premises; plays a particular role in two kinds of
situations:  those  in  which  systems  must  reason  from  incomplete  information  and  those 
in  which  systems  must  reason  using  uncertain  rules. 

Domain:  a  topical  area  or  region  of  knowledge.  Medicine,  management,  science,  and 
engineering  are  very  broad  domains.  Existing  knowledge  systems  only  provide  competent 
advice  within  very  narrowly  defined  domains. 

Domain  Expert:  an  individual  who,  through  years  of  experience  and  training,  has 
become  extremely  skilled  at  problem  solving  in  a  specific  domain. 


Domain  Knowledge:  knowledge  about  the  problem  domain,  e.g.,  knowledge  about 
geology  in  an  expert  system  for  discovering  oil  reserves. 

Expert  System:  (1)  a  computer  system  that  can  perform  at,  or  near,  the  level  of  a 
human  expert.  (2)  any  computer  system  developed  by  means  of  a  loose  collection  of 
techniques associated with artificial intelligence research; therefore, any computer system
developed by means of an expert system-building tool (even were the system to be so
narrowly constrained that it could never be said to rival a human expert). (3) a computer
program  that  performs  a  specialized,  usually  difficult  professional  task  at  the  level  of  (or 
sometimes  beyond  the  level  of)  a  human  expert.  Because  their  functioning  relies  heavily 
on  large  bodies  of  knowledge,  expert  systems  are  sometimes  known  as  knowledge-based 
systems.  Since  they  are  often  used  to  assist  the  human  expert,  they  are  also  known  as 
intelligent  assistants. 

Expert System-Building Tool: the programming language and support package for
building an expert system.

Expertise:  skill  and  knowledge  possessed  by  humans  resulting  in  performance  far  above 
the  norm;  consists  of  massive  amounts  of  information  combined  with  rules  of  thumb, 
simplifications,  rare  facts,  and  wise  procedures  in  such  a  fashion  that  a  person  can  analyze 
specific  types  of  problems  in  an  efficient  way. 

Explanation:  information  presented  to  justify  a  specific  course  of  reasoning  or  action. 
In  knowledge  systems,  a  number  of  techniques  that  help  a  user  understand  what  a  system 
is  doing.  Many  knowledge  systems  permit  a  user  to  ask  "Why,"  "How,"  or  "Explain." 
In  each  case,  the  system  responds  by  revealing  something  about  its  assumptions  or  its 
inner  reasoning. 

Explanation  Facility:  the  portion  of  an  expert  system  that  explains  how  solutions  were 
arrived  at  and  justifies  the  steps  used  in  reaching  them;  keeps  track  of  the  reasoning  paths 
used  by  the  inference  engine  to  reach  its  conclusions  [53]. 

HPOTP:  High-Pressure  Oxidizer  Turbopump 

Inference:  a  process  by  which  new  facts  are  derived  from  known  facts. 

Inference  Chain:  the  sequence  of  steps  or  rule  applications  utilized  by  a  rule-based 
system  to  reach  a  conclusion. 

Inference  Engine:  That  portion  of  a  knowledge  system  containing  the  inference  and 
control  strategies;  includes  various  knowledge  acquisition,  explanation,  and  user  interface 
subsystems.  Inference  engines  are  characterized  by  the  inference  and  control  strategies 
they  use. 
Instantiation: A pattern or formula in which the variables have been replaced by
constants. The association of a particular individual with the characteristics of some
class, or the assignment of particular values to the parameters of a procedure.

Integer Linear Programming [30]: The general problem of allocating limited resources
among competing activities in the best possible, or optimal, integer-valued manner.

Intelligent System: See Expert System.

Knowledge:  an  integrated  collection  of  facts  and  relationships  which,  when  exercised, 
produces  competent  performance.  The  quantity  and  quality  of  knowledge  possessed  by 
an individual or a computer can be judged by the variety of situations in which the
individual  or  program  obtains  successful  results. 

Knowledge  Acquisition:  the  process  of  locating,  collecting,  and  refining  knowledge;  may 
require  interviews  with  experts,  research  in  a  library,  or  introspection.  The  individual 
doing  this  must  convert  the  acquired  knowledge  into  a  form  that  can  be  used  by  a 
computer  program. 

Knowledge  Base:  Facts,  assumptions,  beliefs,  heuristics,  and  expertise;  methods  of 
dealing  with  the  data  base  to  achieve  desired  results  such  as  a  diagnosis,  interpretation, 
or  solution  to  a  problem. 

Knowledge  Engineer:  An  individual  whose  specialty  is  assessing  problems,  acquiring 
knowledge,  and  building  expert  systems. 

Knowledge  Engineering:  The  discipline  of  designing  and  building  expert  systems  and 
other  knowledge-based  programs. 

Knowledge  Representation:  A  means  for  encoding  and  storing  facts  and  relationships 
in  a  knowledge  base.  Semantic  networks,  object-attribute-value  triplets,  production  rules, 
frames,  and  logical  expressions  are  all  ways  to  represent  knowledge. 

LOX:  Liquid  Oxygen 

MACK:  Module  for  the  Acquisition  of  Consistent  Knowledge.  A  knowledge  acquisition 
tool  designed  to  support  Bayesian  Forests. 

Natural  Language:  a  branch  of  artificial  intelligence  research  that  studies  methods 
permitting  computer  systems  to  accept  inputs  and  produce  outputs  in  a  conventional 
language  like  English. 
NP [21]: The class of decision problems that can be solved in polynomial time by a
non-deterministic computer. Most of the apparently intractable problems encountered in
practice, when phrased as decision problems, belong to this class.

NP-Complete  [21]:  The  equivalence  class  consisting  of  the  hardest  problems  in  NP. 

NP-Hard [21]: Any decision problem, whether a member of NP or not, to which we
can transform an NP-complete problem and which, as a result, is at least as hard as the
NP-complete problems.

Object-Oriented  Techniques:  Programming  procedures  based  on  the  use  of  items  called 
objects  that  communicate  with  one  another  via  messages  in  the  form  of  global  broadcasts. 

PESKI  [53]:  Probabilities,  Expert  Systems,  Knowledge  and  Inference. 

Probability:  Various  approaches  to  statistical  inference  used  for  determining  the 
likelihood  of  a  particular  relationship.  Expert  systems  have  generally  avoided  probability 
and  used  confidence  or  certainty  factors  instead.  See  Certainty. 

Probability  Propagation:  The  adjusting  of  probabilities  at  the  nodes  in  an  inference  net 
accounting  for  the  effect  of  new  information  about  the  probability  at  a  specific  node. 

Probabilistic  Reasoning  [29]:  Mathematical  theories  based  on  probability  and  statistics 
are  used  to  accept  or  reject  proposed  hypotheses  and  to  draw  other  kinds  of  conclusions. 
These  theories  are  quite  well  developed  and  computable  in  a  straightforward  numerical 
way.  Thus,  it  is  the  design  of  the  hypotheses  and  the  use  to  which  the  conclusions  are 
put  that  has  more  to  do  with  AI  than  the  actual  method  of  reaching  the  conclusion. 

PTDS:  Post-Test  Diagnostics  System. 

Real-Time:  Pertaining  to  an  application  in  which  response  to  input  is  fast  enough  to 
affect  subsequent  input,  such  as  a  process  control  system  or  a  computer-assisted 
instruction  system. 

Real-World  Problem:  A  complex,  practical  problem  having  a  solution  that  is  useful  in 
some  cost-effective  fashion. 

Reasoner:  See  Inference  Engine. 

Simplex  Method  [72],  [30]:  A  general  algorithm  for  solving  linear  programming 
problems  developed  by  George  Dantzig  in  1947. 

SSME:  Space  Shuttle  Main  Engine 
Stirling Numbers of the First Kind [68]: Related to binomial coefficients, these count
the  number  of  ways  to  arrange  n  objects  into  k  cycles. 

Stirling  Numbers  of  the  Second  Kind  [68]:  Related  to  binomial  coefficients,  these 
count  the  number  of  ways  to  partition  a  set  of  n  objects  into  k  non-empty  sets. 
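
Both kinds of Stirling number satisfy simple recurrences, which the Python sketch below evaluates. It uses the standard textbook recurrences and the unsigned numbers of the first kind; it is not taken from MACK, and the function names are our own.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling_first(n: int, k: int) -> int:
    """Unsigned Stirling number of the first kind: arrangements of n objects into k cycles."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    # Object n either forms its own cycle or is inserted after one of the other n - 1 objects.
    return stirling_first(n - 1, k - 1) + (n - 1) * stirling_first(n - 1, k)


@lru_cache(maxsize=None)
def stirling_second(n: int, k: int) -> int:
    """Stirling number of the second kind: partitions of n objects into k non-empty sets."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    # Object n either forms its own set or joins one of the k existing sets.
    return stirling_second(n - 1, k - 1) + k * stirling_second(n - 1, k)


print(stirling_first(4, 2), stirling_second(4, 2))
```

For n = 4 and k = 2 these give 11 cycle arrangements and 7 set partitions.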

Tool: Computer software package that simplifies the effort involved in building an
expert system; contains an inference engine and various user interface and knowledge
acquisition aids, but lacks a knowledge base.

Uncertainty: With expert systems, a value that cannot be determined during a
consultation. Most expert systems can accommodate uncertainty by allowing the user to
indicate if s/he does not know the answer.